Mind2Web API
Mind2Web Description
Mind2Web is a dataset for developing and evaluating generalist web agents that can follow language instructions to complete complex tasks on websites. See the Mind2Web paper for details.
Note that this API contains only the Mind2Web training data. The test data is withheld to keep it out of reach of crawlers that collect LLM training data, which would contaminate evaluation.
Mind2Web API Documentation
Introduction
This API serves the Mind2Web engine, which provides endpoints to access partial and full datasets.
Version: 0.0.9 (Experimental)
Status: Development
Rate Limit: 500 requests per minute.
Root Endpoint: https://api.junglegym.ai/
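To stay under the stated limit of 500 requests per minute, a client can space its calls at least 60 / 500 = 0.12 seconds apart. The sketch below is one simple way to do that; the class name `MinIntervalThrottle` is hypothetical, and whether the server counts the limit per endpoint or per client is not stated here, so treat the budget as an assumption.

```python
import time


class MinIntervalThrottle:
    """Client-side pacing: keep calls at least 60 / max_per_minute seconds apart.

    Hypothetical helper, not part of the Mind2Web API.
    """

    def __init__(self, max_per_minute=500, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()
        self._next_allowed = now + self.min_interval
```

Calling `throttle.wait()` before each request keeps a single-threaded client within the budget; a multi-threaded client would additionally need a lock around `wait`.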
Endpoints
1. Root Test Endpoint
URL:
/
Method:
GET
Rate Limit: 500/minute.
Response: A welcome message.
{
  "message": "Hello World from the JungleGym dataset server!"
}
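A minimal sketch of calling the root endpoint with Python's standard library. The function names `parse_welcome` and `fetch_welcome` are illustrative and not part of the API; only the root URL and the `message` field come from the description above.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://api.junglegym.ai/"


def parse_welcome(body):
    """Extract the welcome text from the raw JSON response body (bytes)."""
    return json.loads(body.decode("utf-8"))["message"]


def fetch_welcome(base_url=BASE_URL, timeout=10.0):
    """GET the root endpoint and return its welcome message."""
    with urlopen(base_url, timeout=timeout) as resp:
        return parse_welcome(resp.read())
```

Calling `fetch_welcome()` is a quick way to check that the server is reachable before requesting the larger dataset endpoints.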
2. Load Light Train Dataset
URL:
/load_light_train_dataset
Method:
GET
Rate Limit: 500/minute.
Response: Returns light training dataset.
{
"data": [...]
}
3. Load Full Train Dataset
URL:
/load_full_train_dataset
Method:
GET
Rate Limit: 500/minute.
Response: Returns full training dataset.
{
"data": [...]
}
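The two dataset endpoints share the same response shape, so one helper can serve both. The sketch below assumes the `data` field is a JSON list of records, as the `[...]` placeholder suggests; the names `dataset_url`, `extract_records`, and `load_train_dataset` are hypothetical.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "https://api.junglegym.ai/"

# Endpoint paths as listed in this documentation.
DATASET_PATHS = {
    "light": "load_light_train_dataset",
    "full": "load_full_train_dataset",
}


def dataset_url(kind, base_url=BASE_URL):
    """Build the URL for the 'light' or 'full' training dataset."""
    return urljoin(base_url, DATASET_PATHS[kind])


def extract_records(payload):
    """Return the list under 'data', failing loudly if the shape differs."""
    data = payload.get("data")
    if not isinstance(data, list):
        raise ValueError("expected a list under the 'data' key")
    return data


def load_train_dataset(kind="light", timeout=60.0):
    """GET the chosen dataset and return its records."""
    with urlopen(dataset_url(kind), timeout=timeout) as resp:
        return extract_records(json.load(resp))
```

Starting with `load_train_dataset("light")` is likely the cheaper way to explore the data before pulling the full set.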
4. Get List of Actions
URL:
/get_list_of_actions?annotation_id=<annotation_id>
Method:
GET
Rate Limit: 500/minute.
Response: Returns actions and their representations for a given annotation ID.
{
"actions": [...],
"action_reprs": [...]
}
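Endpoints that take an annotation ID all use the same `annotation_id` query parameter, so the URL can be built by one helper. The sketch below also pairs each action with its representation, on the assumption that `actions` and `action_reprs` are parallel lists of equal length; that alignment is not stated in this documentation. The helper names are hypothetical.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://api.junglegym.ai/"


def annotation_url(endpoint, annotation_id, base_url=BASE_URL):
    """Build '<base>/<endpoint>?annotation_id=<id>' with proper URL encoding."""
    query = urlencode({"annotation_id": annotation_id})
    return f"{urljoin(base_url, endpoint)}?{query}"


def pair_actions(payload):
    """Zip each action representation with its action.

    Assumes the two lists are index-aligned; raises if their lengths differ.
    """
    actions = payload["actions"]
    reprs = payload["action_reprs"]
    if len(actions) != len(reprs):
        raise ValueError("actions and action_reprs have different lengths")
    return list(zip(reprs, actions))
```

The same `annotation_url` helper works for any endpoint path that accepts `annotation_id`, such as `get_list_of_actions`.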
5. Get Raw JSON Screenshots
URL:
/get_raw_json_screenshots?annotation_id=<annotation_id>
Method:
GET
Rate Limit: 500/minute.
Response: Returns raw JSON screenshots for a given annotation ID.
{
"data": {...}
}
6. Get Raw DOM Content
URL:
/get_raw_dom_content?annotation_id=<annotation_id>
Method:
GET
Rate Limit: 500/minute.
Response: Returns raw DOM content for a given annotation ID.
{
"data": {...}
}
7. Get Storage
URL:
/get_storage?annotation_id=<annotation_id>
Method:
GET
Rate Limit: 500/minute.
Response: Returns storage data for a given annotation ID.
{
"data": {...}
}
8. Get Raw Trace Zip
URL:
/get_raw_trace_zip?annotation_id=<annotation_id>
Method:
GET
Rate Limit: 500/minute.
Response: Returns a trace.zip file for the given annotation ID.
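Unlike the JSON endpoints, this one returns binary data, so the response should be read as bytes and written to disk. The sketch below assumes the body is the zip archive itself (not JSON wrapping it) and checks that before saving; `download_trace_zip` and `list_zip_members` are hypothetical names.

```python
import io
import zipfile
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.junglegym.ai/"


def list_zip_members(raw):
    """Return the file names inside an in-memory zip archive (bytes)."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.namelist()


def download_trace_zip(annotation_id, dest_path, timeout=60.0):
    """Download trace.zip for one annotation, save it, and list its members."""
    query = urlencode({"annotation_id": annotation_id})
    url = f"{BASE_URL}get_raw_trace_zip?{query}"
    with urlopen(url, timeout=timeout) as resp:
        raw = resp.read()
    # Guard against saving an error page or JSON body as a .zip file.
    if not zipfile.is_zipfile(io.BytesIO(raw)):
        raise ValueError("response body is not a zip archive")
    with open(dest_path, "wb") as fh:
        fh.write(raw)
    return list_zip_members(raw)
```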
Errors
The API will return specific HTTP status codes for different kinds of errors:
401 Unauthorized: For forbidden or unauthorized access.
404 Not Found: If the requested resource or data is not found.
500 Internal Server Error: For any internal server issues.
Make sure to check the detail field in the response for a specific error message.
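One way to surface these errors in a client is to catch the HTTP error and read `detail` from its body. The sketch below assumes error responses are JSON objects with a top-level `detail` field and falls back to the raw text otherwise; `Mind2WebAPIError`, `error_detail`, and `get_json` are hypothetical names.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen


class Mind2WebAPIError(Exception):
    """Raised when the API answers with an HTTP error status."""

    def __init__(self, status, detail):
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def error_detail(body):
    """Pull 'detail' out of an error body (bytes); fall back to the raw text."""
    text = body.decode("utf-8", errors="replace")
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return text
    if isinstance(payload, dict) and "detail" in payload:
        return str(payload["detail"])
    return text


def get_json(url, timeout=30.0):
    """GET a JSON endpoint, converting HTTP errors into Mind2WebAPIError."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except HTTPError as exc:
        raise Mind2WebAPIError(exc.code, error_detail(exc.read())) from exc
```

A caller can then branch on `err.status` (401, 404, or 500) and log `err.detail` for the specific message.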