Data Management
This documentation describes Heartex platform version 1.0.0, which is no longer supported. For information about managing data in Label Studio Enterprise Edition, the equivalent of Heartex platform version 2.0.x, see Label and annotate data.
The project Data Manager page provides functions for task management and quality control. Switch to Quick View mode to explore tasks faster.
- Enter Label Stream to start labeling tasks in the project's sampling order
- Enter Verify Stream to rate annotator completions
- Import tasks
- Export completions and results
- Open the Ground Truth Manager
- Switch between Quick View and Full View using these buttons. In Quick View you can move between tasks quickly; in Full View you can see all statistics and task statuses.
- Click the pencil button to open the editor with the selected task, together with completions from all annotators and ML predictions. In Quick View, use Ctrl + click to open the editor in a new tab.
- Delete a task from the project
Label Stream
In Label Stream mode, tasks are shown in the project's sampling order. This mode closely resembles the annotator's labeling workflow. The completion panel is hidden.
Verify Stream
Verify Stream lets you flag completions as correct or mistaken. You are prompted to give a thumbs up or thumbs down to each project completion, in random order. All flags appear on the Data Manager page in the Review column.
Import Tasks
You can import data through our API or by uploading JSON, CSV, TSV, ZIP, or RAR files. You can import more data at any time in Data Manager. Text and hypertext resources can be included in tasks directly; they are hosted on our servers, so no separate hosting is needed. For images, audio, time series, video, and other BLOBs, use external hosting with http/https links or S3 storage.
Read more about Import format.
In the «Add more data» dialog you can download sample files for import or add a sample task in one click.
Remember to configure CORS when importing tasks. Cross-origin requests must be allowed and properly configured on the external hosting; otherwise task data sources will not load.
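As a rough illustration of the CORS requirement, the sketch below checks whether a set of HTTP response headers would let a browser page at a given origin read the resource. It relies only on the standard `Access-Control-Allow-Origin` response header; the function name, the example origin, and the example headers are assumptions made for illustration and are not part of the Heartex platform.

```python
from typing import Mapping


def cors_allows(headers: Mapping[str, str], origin: str) -> bool:
    """Return True if the response headers permit cross-origin reads from `origin`."""
    # HTTP header names are case-insensitive, so normalise them before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    allowed = normalized.get("access-control-allow-origin")
    # Either a wildcard or an exact match of the requesting origin is accepted.
    return allowed is not None and allowed in ("*", origin)


# Hypothetical response headers from an external image host.
example_headers = {"Content-Type": "image/png", "Access-Control-Allow-Origin": "*"}
print(cors_allows(example_headers, "https://app.example.com"))  # True
```

Headers from a real response can be passed in the same way, for instance as `dict(response.headers.items())` for a response obtained with `urllib.request.urlopen`.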
Export Completions
So you have been working hard labeling your data and have accumulated a respectable amount. How do you get the data out of the application and onto your computer? The platform provides an export function for this.
The export results are in JSON format. Because the import and export formats are the same, the exported file can be imported into Heartex projects again (just enable the «Include full task descriptions» option). Export is also supported at the API level.
Aggregation of completions
If you set «Overlap of completion» for Collaborators to more than 1 in the project settings, tasks will have multiple completions. In this case majority vote aggregation can be helpful: it merges all completions of a task into one.
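The idea of majority vote aggregation can be sketched for the simplest case, where each completion of a task is a single class label. This is only an illustration: how the platform aggregates richer results (regions, multiple choices) and how it breaks ties is not described here, so the tie rule and the function name below are assumptions.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(labels: Sequence[Hashable]) -> Hashable:
    """Return the most frequent label; ties go to the label seen first (assumed rule)."""
    if not labels:
        raise ValueError("at least one completion is required")
    counts = Counter(labels)
    # most_common keeps first-encountered order among labels with equal counts.
    return counts.most_common(1)[0][0]


# Three annotators labeled the same task.
completions = ["Cat", "Dog", "Cat"]
print(majority_vote(completions))  # Cat
```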
Include full task descriptions
This option includes the full body of each task in the exported file. Use it if you want to import the exported file into another project with a matching labeling config.
Include predictions
Includes ML predictions, presented as a “predictions” array for each task in the output JSON.
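A minimal sketch of reading such an export follows. Only the "predictions" key is taken from the description above; the surrounding structure (a list of task objects with an "id" and a "result" inside each prediction) is a hypothetical shape used for illustration, and the real export may differ.

```python
import json

# Hypothetical excerpt of an exported file. Only the "predictions" key comes
# from the documentation; the other field names are illustrative assumptions.
exported = json.loads("""
[
  {"id": 1, "predictions": [{"result": "Cat"}]},
  {"id": 2, "predictions": []}
]
""")


def tasks_with_predictions(tasks):
    """Return the tasks whose "predictions" array is present and non-empty."""
    return [task for task in tasks if task.get("predictions")]


print(len(tasks_with_predictions(exported)))  # 1
```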
Tips
If your project has no labeled data and you don’t enable the «Include full task descriptions» option, the download returns empty results.
If your project is not using a model or the requirements for a model have not yet been met, then the downloaded results will only include hand-created labels.
If your project has an ML model, then the downloaded results will include both manual labels and model-assisted labels.
The export operation can take a long time, depending on the number of completions. You can start the export and reload the Data Manager page: your export history is saved in «Last exports».
Ground Truth Completions
Ground Truth (GT) completions are special items which can be used for:
- evaluating annotator statistics relative to GT completions
- evaluating machine learning accuracy relative to GT completions
- retraining the ML model with or without all GT completions
You can create GTs in several ways:
- Mark a completion as GT in task explore mode using the star icon.
- Mark a completion in the Data Manager table. In this case the completion is chosen by priority: first the project owner's, then other annotators'.
- Import completions marked as GT together with tasks.
- Use the GT Manager for batch marking.
Ground Truth Manager
Ground Truth (GT) Manager is a fast way to mark multiple completions as GTs.
Press the green star button on the Data Manager page.
The first step is to set the completion filter. The second step is to set the percentage of the filtered completions that will be marked as GTs. You can also reset all GT completions to regular completions here.
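The two steps can be pictured as "filter, then take a percentage". The sketch below assumes the completions are chosen at random from the filtered set and that the count is rounded down; neither detail is stated in this documentation, and the function and variable names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def pick_ground_truth(completion_ids: Sequence[int], percent: float,
                      seed: Optional[int] = None) -> List[int]:
    """Pick `percent` of the filtered completion ids to mark as GT (assumed random choice)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Assumption: fractional counts are rounded down.
    count = int(len(completion_ids) * percent / 100)
    rng = random.Random(seed)
    return sorted(rng.sample(list(completion_ids), count))


# Step 1: ids of completions that passed the filter (hypothetical values).
filtered = list(range(1, 21))
# Step 2: mark 25% of them as ground truth.
chosen = pick_ground_truth(filtered, percent=25, seed=0)
print(len(chosen))  # 5
```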
Filters
Filters are the classic way to find the tasks you are looking for. Combined filters work in intersection mode; for example, you can find all completions that contain the class name Cat and were completed by the annotator heartex@heartex.ai.
The result counters in the small “Found” panel on the right, near the “Filters” button, show statistics for the current page and are updated after the page reloads. All icons have tooltips with their names.
- Task data: filter by substring in specified field
- Completion results: filter by classes, types of labeling tags, and any other information from the “JSON as text” representation of completion.result
- Prediction results: very similar to completion results
- Collaborator: dropdown with the project annotators
- Outliers: find tasks with poor collaborator agreement (less than 33%) or high skipped rates (more than 50%)
- Flagged regions: show only tasks where completions have flagged regions
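Intersection mode and the outlier thresholds listed above can be sketched as plain predicates combined with a logical AND. The `TaskSummary` class and its field names are hypothetical and exist only for this illustration; they do not reflect the platform's internal data model.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass
class TaskSummary:
    """Hypothetical per-task summary; the field names are illustrative only."""
    task_id: int
    labels: Set[str] = field(default_factory=set)
    annotators: Set[str] = field(default_factory=set)
    agreement: float = 1.0      # collaborator agreement as a fraction, 0.0 to 1.0
    skipped_rate: float = 0.0   # share of annotators who skipped the task


Predicate = Callable[[TaskSummary], bool]


def apply_filters(tasks: Iterable[TaskSummary], *predicates: Predicate) -> List[TaskSummary]:
    """Keep only the tasks that satisfy every predicate (intersection mode)."""
    return [task for task in tasks if all(check(task) for check in predicates)]


def is_outlier(task: TaskSummary) -> bool:
    """Agreement below 33% or skipped rate above 50%, as in the Outliers filter."""
    return task.agreement < 0.33 or task.skipped_rate > 0.50


tasks = [
    TaskSummary(1, {"Cat"}, {"heartex@heartex.ai"}, agreement=0.9),
    TaskSummary(2, {"Cat"}, {"someone@example.com"}, agreement=0.2),
    TaskSummary(3, {"Dog"}, {"heartex@heartex.ai"}, skipped_rate=0.6),
]

# Class name Cat AND completed by heartex@heartex.ai.
cats_by_heartex = apply_filters(
    tasks,
    lambda t: "Cat" in t.labels,
    lambda t: "heartex@heartex.ai" in t.annotators,
)
print([t.task_id for t in cats_by_heartex])         # [1]
print([t.task_id for t in tasks if is_outlier(t)])  # [2, 3]
```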