Update to function as out-of-the-box test server #13
Conversation
NGINX now also listens on port 8000 on the Docker network. This is an important step toward being able to start these `services` and have them function as a local test server for openml-python, among others.
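As a quick smoke test from the host (a sketch, assuming the stack is started with the `all` profile that appears further down and that the 8000 port mapping discussed below is in place):

$ docker compose --profile all up -d
$ curl -I http://localhost:8000/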
| # Update openml.expdb.dataset with the same url
| mysql -hdatabase -uroot -pok -e 'UPDATE openml_expdb.dataset DS, openml.file FL SET DS.url = FL.filepath WHERE DS.did = FL.id;'
These removed updates are now embedded in the state of the database on the new image.
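For reference, the baked-in state can be inspected with the same credentials the removed command used (a sketch; the exact rows depend on the image):

$ mysql -hdatabase -uroot -pok -e 'SELECT did, url FROM openml_expdb.dataset LIMIT 3;'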
| sed -i -E 's/^(::1\t)localhost (.*)$/\1\2/g' /etc/hosts.new
| cat /etc/hosts.new > /etc/hosts
| rm /etc/hosts.new
For the other containers, updating /etc/hosts through configuration was sufficient. For this one, the pre-existing /etc/hosts entry took precedence, so the file had to be rewritten in place.
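To illustrate what the sed accomplishes (a sketch; the exact ::1 line varies per container, and the \t escape assumes GNU sed):

$ printf '::1\tlocalhost ip6-localhost ip6-loopback\n' | sed -E 's/^(::1\t)localhost (.*)$/\1\2/g'
::1	ip6-localhost ip6-loopback

Dropping `localhost` from the IPv6 line stops it from resolving to ::1, so it can instead be pointed at nginx's IPv4 address.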
| - "8000:8000" | ||
| networks: | ||
| default: | ||
| ipv4_address: 172.28.0.2 |
The static IP address is required so that we can add entries to the /etc/hosts file of other containers, making them contact nginx when they resolve localhost.
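For the other containers this is what the configuration boils down to; --add-host appends the entry to the container's /etc/hosts (a sketch; alpine is just an example image):

$ docker run --rm --add-host localhost:172.28.0.2 alpine cat /etc/hosts

As the comment above notes, where a pre-existing localhost entry takes precedence, the file has to be rewritten instead.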
| @@ -1,4 +1,4 @@
| - CONFIG=api_key=AD000000000000000000000000000000;server=http://php-api:80/
| + CONFIG=api_key=abc;server=http://php-api:80/
I don't understand: here the api key is changed from AD000000000000000000000000000000 to abc ...

AD000000000000000000000000000000 was the api key in the old test database image, but it has been changed to abc to match the test server database.

The evaluation engine currently needs administrator access.
| apikey=normaluser
| server=http://localhost:8000/api/v1/xml
... and here the api key is set from AD000000000000000000000000000000 to normaluser.
So far, these were the keys for developers:
- php-api (v1) test-server: normaluser
- php-api (v1) local-server: AD000000000000000000000000000000

Has anything changed here? Also, what are the api keys for python-api (v2), now that it will also be added to `services` with a frozen docker image?

This configuration is just for when you spin up an openml-python container to use the Python API. It does not need administrator access, so I changed the key to normaluser, which is a normal read-write account.
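For a local openml-python checkout (rather than the container), the equivalent would be writing the same values to the client's default config file (a sketch; assumes the stock ~/.config/openml/config location):

$ mkdir -p ~/.config/openml
$ printf 'apikey=normaluser\nserver=http://localhost:8000/api/v1/xml\n' > ~/.config/openml/config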

The Python-based REST API uses the keys that are in the database. The server is unaffected, but I will need to update the keys used in its tests.
josvandervelde left a comment:
Looking good! I encountered some problems when using Python to connect to the locally running containers.
| minio:
|   profiles: ["all", "minio", "evaluation-engine"]
| -  image: openml/test-minio:v0.1.20241110
| +  image: openml/test-minio:v0.1.20260204
This minio contains most parquet files out of the box, but not all!
bash-5.1# ls /data/datasets/0000/0001
dataset_1.pq phpFsFYVN
bash-5.1# ls /data/datasets/0000/0128
iris.arff
This is probably a mistake?
Also, it contains some weird files:
bash-5.1# ls /data/datasets/0000
0000 '0000?C=S;O=A' '0000?C=D;O=A' '0000?C=M;O=A' '0000?C=N;O=D' ....

Apparently the weird files are Apache autoindex sort links: https://httpd.apache.org/docs/2.4/mod/mod_autoindex.html. Harmless, but I'll update the wget command to exclude them.
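Something along these lines should exclude them (a sketch; <server> is a placeholder, and --reject-regex requires GNU wget 1.14 or newer):

$ wget --recursive --no-parent --reject-regex '\?C=' https://<server>/datasets/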

The omission of 128 was accidental, but it turned out to be useful for the openml-python API tests that require an arff file (which isn't easily downloaded anymore if parquet files are present). I will hold off on adding that parquet file because:
- I would need to update openml-python (or at least its tests) accordingly
- Services should be able to handle a missing parquet file for now, as not all datasets have parquet files in production either

As for the reason it was skipped ... that's worth looking into. For now, I'll add a note to the readme.
| my_task = openml.tasks.get_task(my_task.task_id)
| from sklearn import compose, ensemble, impute, neighbors, preprocessing, pipeline, tree
| clf = tree.DecisionTreeClassifier()
| run = openml.runs.run_model_on_task(clf, my_task)
I get errors here:

Traceback (most recent call last):
  File "/openml/openml/datasets/dataset.py", line 593, in _parse_data_from_pq
    data = pd.read_parquet(data_file)
OSError: Repetition level histogram size mismatch on

It seems to have something to do with the pyarrow version in openml-python. Maybe unrelated to this PR, but I haven't seen these problems before. Do you see these problems as well?

Yes, I had sent a message on Slack about it. Basically, the openml-python image is so outdated that the newly generated parquet files cannot be loaded. If you take the shell as an entrypoint and first update pyarrow and pandas, it works fine.
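For example (a sketch; the image name and tag are assumed):

$ docker run -it --rm --entrypoint /bin/bash openml/openml-python:latest
$ pip install --upgrade pyarrow pandas   # run inside the container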
Updating routing and data of the images to allow an out-of-the-box test server on a local machine.
Currently the updated configuration allows running the openml-python unit tests that require the test server (see openml/openml-python#1630).
I still have to cross-check that I didn't break other functionality in the process.