Import & Export

This documentation describes Heartex platform version 1.0.0, which is no longer supported. For information about importing data in Label Studio Enterprise Edition, the equivalent of Heartex platform version 2.0.x, see Get data into Label Studio.

How to import tasks?

  1. Prepare your tasks in the format described below.
  2. Go to Project - Data Manager - Add more data dialog.

Read more about using import and export through the UI in Data Manager.

Task format

Import and export formats are the same.
The platform stores tasks as a JSON-formatted list.
Each task is a dictionary-like structure in which some keys are reserved for internal use (see the Example section below).

You can find more detailed information about the task and completion structure in Import API.

External resources & BLOBs

Images, audio, video, and other external files must be uploaded to a host that is accessible over http/https. Your JSON/CSV/TSV task files must contain valid http/https URLs pointing to them. For example, to prepare image tasks for import:

  1. Upload the files to any hosting service, or serve them locally with any web server.

  2. Copy the http/https links to your images.

  3. Create tasks.json like this:

    [{
      "image_source": "http://example.com/test1.jpg" 
    },
    {
      "image_source": "http://example.com/test2.jpg" 
    }]

  4. Go to the Add more data dialog and select the prepared file.
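
Step 3 can also be scripted when you have many files. The following is a minimal Python sketch, assuming your labeling config reads the image URL from an image_source field as in the example above; build_image_tasks and write_tasks_file are hypothetical helper names, not part of the platform.

```python
import json


def build_image_tasks(urls, key="image_source"):
    """Return one task dict per URL, stored under the given key."""
    return [{key: url} for url in urls]


def write_tasks_file(path, urls, key="image_source"):
    """Serialize the tasks for the given URLs into a JSON file at path."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_image_tasks(urls, key), f, indent=2)


urls = [
    "http://example.com/test1.jpg",
    "http://example.com/test2.jpg",
]
tasks = build_image_tasks(urls)
```

Calling write_tasks_file("tasks.json", urls) would then produce the file to select in step 4.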

Don’t forget about CORS settings when importing tasks. CORS must be enabled and properly configured on the external host; otherwise, the task data sources won’t load.
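
To illustrate what "allowed" means here, the sketch below approximates the check a browser applies to a simple cross-origin request without credentials: the response must carry an Access-Control-Allow-Origin header that is either * or equal to the origin of the page. The function name cors_allows is hypothetical, and the origin https://app.heartex.ai is only an assumption borrowed from the API examples in this document.

```python
def cors_allows(response_headers, page_origin):
    """Return True if the headers permit a simple cross-origin read from page_origin."""
    # HTTP header names are case-insensitive, so normalize them before the lookup.
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    return allowed == "*" or allowed == page_origin


# Headers as your file host might return them for an image request.
print(cors_allows({"Access-Control-Allow-Origin": "*"}, "https://app.heartex.ai"))
```

You would feed it the response headers of one of your hosted files; a False result suggests the host still needs a CORS rule for the labeling page's origin.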

Another option to import external resources is to use Cloud Storages.

Example

Here is an example of a labeling config and a task list with a single element, for a text classification project:

<View>
  <Text name="message" value="$my_text"/>
  <Choices name="sentiment_class" toName="message">
    <Choice value="Positive"/>
    <Choice value="Neutral"/>
    <Choice value="Negative"/>
  </Choices>
</View>

[{
  # "id" is a reserved field, avoid using it when importing tasks
  "id": 123,

  # "data" must contain the "my_text" field defined by the labeling config,
  # and can optionally include other fields
  "data": {
    "my_text": "Opossum is great",
    "ref_id": 456,
    "meta_info": {
      "timestamp": "2020-03-09 18:15:28.212882",
      "location": "North Pole"
    } 
  },

  # "completions" is the list of annotation results that match the labeling config schema
  "completions": [{
    "result": [{
      "from_name": "sentiment_class",
      "to_name": "message",
      "type": "choices",
      "value": {
        "choices": ["Positive"]
      }
    }]
  }],

  # "predictions" are pretty similar to "completions" 
  # except that they also include some ML related fields like prediction "score"
  "predictions": [{
    "result": [{
      "from_name": "sentiment_class",
      "to_name": "message",
      "type": "choices",
      "value": {
        "choices": ["Neutral"]
      }
    }],
    # score is used for active learning sampling mode
    "score": 0.95
  }]
}]
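
The # annotations in the listing above are explanatory only; standard JSON parsers reject them, so a real import file must omit them. As a sketch of how such a task could be generated programmatically, the Python below builds a task with an optional prediction for the config shown above. The helper name make_task is hypothetical, and "id" is left out because it is reserved.

```python
import json

# Choice values taken from the labeling config above.
LABELS = {"Positive", "Neutral", "Negative"}


def make_task(text, predicted=None, score=None):
    """Build one importable task; optionally attach a single prediction."""
    task = {"data": {"my_text": text}}
    if predicted is not None:
        if predicted not in LABELS:
            raise ValueError(f"unknown label: {predicted!r}")
        prediction = {
            "result": [{
                "from_name": "sentiment_class",  # name of the Choices tag
                "to_name": "message",            # name of the Text tag
                "type": "choices",
                "value": {"choices": [predicted]},
            }]
        }
        if score is not None:
            prediction["score"] = score
        task["predictions"] = [prediction]
    return task


tasks = [make_task("Opossum is great", predicted="Neutral", score=0.95)]
print(json.dumps(tasks, indent=2))
```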

Import file types

You can download an example import file for your project, in any supported format, from the Add more data dialog. A single file is limited to 250k tasks and 200 MB in size.
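
If a dataset exceeds these limits, it has to be split across several files. The following is a rough Python sketch of one way to do that greedily; split_tasks is a hypothetical helper, and reading "200 MB" as 200,000,000 bytes is an assumption chosen because it is the stricter interpretation.

```python
import json

MAX_TASKS = 250_000
MAX_BYTES = 200 * 1000 * 1000  # assumption: decimal megabytes


def split_tasks(tasks, max_tasks=MAX_TASKS, max_bytes=MAX_BYTES):
    """Greedily group tasks into chunks that respect both limits.

    A single task larger than max_bytes still ends up alone in its own chunk.
    """
    chunks, current, current_bytes = [], [], 2  # 2 bytes for the enclosing "[]"
    for task in tasks:
        # Serialized size of the task plus the ", " separator json.dumps adds.
        size = len(json.dumps(task).encode("utf-8")) + 2
        if current and (len(current) >= max_tasks or current_bytes + size > max_bytes):
            chunks.append(current)
            current, current_bytes = [], 2
        current.append(task)
        current_bytes += size
    if current:
        chunks.append(current)
    return chunks


# Tiny limits make the behavior visible on a toy dataset.
demo = [{"data": {"n": i}} for i in range(5)]
print([len(chunk) for chunk in split_tasks(demo, max_tasks=2)])
```

Each resulting chunk can then be written to its own file with json.dump and imported separately.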

Supported image formats

.png .jpg .jpeg .tiff .bmp .gif

Supported audio formats

.wav .aiff .mp3 .au .flac
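
Before building a task file, it can help to check that each URL points to a supported format. The sketch below classifies a URL by its file extension using the two lists above; media_kind is a hypothetical helper, and treating extensions as case-insensitive is an assumption rather than documented platform behavior.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif"}
AUDIO_EXTENSIONS = {".wav", ".aiff", ".mp3", ".au", ".flac"}


def media_kind(url):
    """Return "image", "audio", or None based on the URL path's file extension."""
    # Look only at the path so a query string such as "?v=2" does not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return None


print(media_kind("http://example.com/test1.jpg"))
```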

Quick API overview

Read more in the task import API and full task API sections.

Import tasks API

curl -H 'Content-Type: application/json' -H 'Authorization: Token abc123' \
-X POST 'https://app.heartex.ai/api/projects/1/tasks/bulk/' --data @my_file.json

where my_file.json is

[{
  "data": {
    "my_image_url": "https://app.heartex.ai/static/samples/kittens1.jpg"
  }
}, {
  "data": {
    "my_image_url": "https://app.heartex.ai/static/samples/kittens2.jpg"
  }
}]
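
The same request can be prepared from Python using only the standard library. This is a sketch under the assumptions of the curl example above (the same endpoint, the placeholder token abc123, and project ID 1); build_import_request is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_import_request(base_url, project_id, token, tasks):
    """Prepare (but do not send) a POST to the bulk task import endpoint."""
    url = f"{base_url.rstrip('/')}/api/projects/{project_id}/tasks/bulk/"
    body = json.dumps(tasks).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",
        },
    )


tasks = [
    {"data": {"my_image_url": "https://app.heartex.ai/static/samples/kittens1.jpg"}},
    {"data": {"my_image_url": "https://app.heartex.ai/static/samples/kittens2.jpg"}},
]
request = build_import_request("https://app.heartex.ai", 1, "abc123", tasks)
# To actually send it: urllib.request.urlopen(request)
```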

Retrieve task API

To view a task, request the following URL with curl or open it in your browser (replace <task_id> with a real task ID, e.g. 2353):

curl https://app.heartex.ai/api/tasks/<task_id>/

A task has the following format:

{
  "id": 2353,
  "data": {
    "my_image_url": "https://app.heartex.ai/static/samples/kittens.jpg"
  },
  "accuracy": 0.0,
  "created_at": "2019-02-04T20:33:51.361394Z",
  "updated_at": "2019-02-04T20:33:51.361430Z",
  "is_labeled": false,
  "project": 2
}
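
When consuming this response in a script, the timestamps are usually more convenient as datetime objects. The following Python sketch parses a response body shaped like the one above; parse_task is a hypothetical helper, and it assumes the timestamps always end in "Z" (UTC) as in this example.

```python
import json
from datetime import datetime


def parse_task(payload):
    """Decode a task response and convert its timestamps to aware datetimes."""
    task = json.loads(payload)
    for field in ("created_at", "updated_at"):
        # Swap the trailing "Z" for an explicit UTC offset, which
        # datetime.fromisoformat accepts on Python 3.7 and later.
        task[field] = datetime.fromisoformat(task[field].replace("Z", "+00:00"))
    return task


payload = """{
  "id": 2353,
  "data": {"my_image_url": "https://app.heartex.ai/static/samples/kittens.jpg"},
  "accuracy": 0.0,
  "created_at": "2019-02-04T20:33:51.361394Z",
  "updated_at": "2019-02-04T20:33:51.361430Z",
  "is_labeled": false,
  "project": 2
}"""
task = parse_task(payload)
print(task["id"], task["created_at"].isoformat())
```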

Export results API

You can use the API to request a file with exported results.
Read more in API.

Import from Cloud Storage

You can import your data directly from cloud storage (e.g. an AWS S3 bucket). Read more in the Cloud Storages section.

Export to common formats

You can optionally convert and export raw JSON completions to more common formats by applying an open source converter tool.
The following export formats are available depending on a chosen annotation type: