
In this example, you create a text labeling project to analyze the sentiment in a Tweet, and use Majority Vote for Labeling QA.


Project Overview

You want to create a project that enables you to label the sentiment of a given Tweet as “Positive”, “Negative”, “Neutral”, or “Not Sure.”

To create this project, you must perform the following tasks:

  1. List out your project requirements.

  2. Identify sample text that you can use for labeling.

  3. Configure project metadata.

  4. Manage your project input and output fields.

  5. Create the workflow that you want to implement in your project.

  6. Add users to your project and assign them project roles.

  7. Add a dataset to the project.

  8. Start labeling input text.

  9. View batch status reports.

This document explains how you can perform each of the tasks listed above. Specific sections in this document also contain sample data that you can use to easily create and implement this project on Taskmonk.

Listing Project Requirements

In this project, you want to:

  • Identify the sentiment type of a given Tweet using one of four categories: “Positive”, “Negative”, “Neutral”, and “Not Sure.”

Sample Input Data

For this example, copy the text of a few Tweets from Twitter.

  1. Paste the Tweet texts to be labeled into a Microsoft Excel sheet under a column labeled Tweet.

  2. Add a column labeled Tweet# to number the Tweets.

  3. Save the Microsoft Excel sheet as Tweet_Tagging_Input.xlsx on your hard drive.
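If you collect the Tweet texts programmatically rather than pasting them by hand, the two columns described above can be sketched as follows. This is a minimal sketch under stated assumptions: the sample Tweets and the helper names `build_tweet_rows` and `write_rows` are hypothetical, and the rows are written as CSV (using only the Python standard library) that you would then open in Excel and save as an .xlsx file.

```python
import csv
import io


def build_tweet_rows(tweets):
    """Number each tweet from 1 and pair it with its text (Tweet# and Tweet columns)."""
    return [
        {"Tweet#": number, "Tweet": text.strip()}
        for number, text in enumerate(tweets, start=1)
    ]


def write_rows(rows, handle):
    """Write the rows as CSV with a header line to any writable text handle."""
    writer = csv.DictWriter(handle, fieldnames=["Tweet#", "Tweet"])
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample tweets, used only for illustration.
sample_tweets = [
    "Loving the new update, great work!",
    "The app keeps crashing on startup. ",
    "Is the service available in my region?",
]

buffer = io.StringIO()
write_rows(build_tweet_rows(sample_tweets), buffer)
print(buffer.getvalue())
```

To produce a file instead of printing, pass an open file handle (opened with `newline=""`) to `write_rows` in place of the in-memory buffer.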

Download Source File

Alternatively, you can download and use this file for your project: Tweet_tagging_input.xlsx.

Each downloadable file is available in the ZIP format. To use it, unzip its contents after downloading.

Configuring Project Metadata

Project Metadata is the first section you will be redirected to when you create a project. Here, provide the basic information such as the project name, process, and project type. You can also upload any help documentation.

  1. To create the project, click the Create Project floating button on the left side of the Projects page. The Project Metadata tab associated with your new project appears.

  2. Enter Twitter Post Analysis as the Project Name, Sentiment Identification as the Process Name, and Text Labeling as the Process.

  3. Select Text-Based for Project Type.

  4. As your project doesn't require lookup data, keep the Lookup checkbox unselected.

  5. Keep the Enable Project Pipeline option unselected, which is the default setting.

  6. Click Next. The Documents tab opens and lists all available project documents.

  7. (Optional) Click the UPLOAD DOCUMENT button to upload documents associated with the project. Skip this step for now.

  8. Click Next. The Task Design section appears, where you can manage your project input and output fields.

Managing Project Input and Output Fields

A project input field determines what is to be labeled (in this case, the Tweet text), while an output field determines what values the labeling analysts should use for labeling (in this case, the sentiment types “Positive”, “Negative”, “Neutral”, or “Not Sure”). Your project can only accept and output data associated with the input and output fields that you create here.

Taskmonk uses the project type that you specify to add input and/or output fields to projects as required. You can modify these later.

Here, Text-Based is selected as the Project Type, so Taskmonk does not add any fields to the Task Design section.

Creating and Configuring the Input Fields

To create and configure the input fields, follow these steps:

  1. Click the BROWSE INPUT FILE button to extract field names from the sample input file you downloaded in the previous step.

  2. Alternatively, click the CREATE INPUT FIELD button to create the input fields explicitly.

  3. In this example, create two input fields, Tweet# and Tweet, to match the columns of the sample input file.

  4. Set Field Type to text, Mandatory to True, and Operational to True for both these fields.

Creating and Configuring the Output Fields

To create and configure the output field, follow these steps:

  1. Click the CREATE OUTPUT FIELD button.

  2. In the Create Output Field modal window, set the values for the following:

  • Name: Provide the output field name.

  • (Optional) Visible Name: The name that appears in the output file.

  • Set Data Type: Select from the listed options how the output field options will appear. In this case, select DropDown.

  • Possible Values: Type in comma-separated values that this output field can take. These values will be listed for selection in the dropdown list.

  3. Select the All Levels check box to indicate that this field must be available to users at all execution levels, such as labeling, quality analysis, and so on.

  4. Click Create to save the new output field and return to the Task Design > Output Fields tab.

  5. Click Next to move to the next step, and click the Quality Workflow tab.
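The Possible Values setting above is a comma-separated list. As a rough illustration of how such a list maps to the dropdown options, the sketch below splits the string and checks whether a label is one of the allowed values. The function names `parse_possible_values` and `is_valid_label` are hypothetical, and Taskmonk's own parsing rules (for example, how it treats surrounding spaces or duplicates) are not described on this page, so treat the behavior here as an assumption.

```python
def parse_possible_values(raw):
    """Split a comma-separated string into a list of trimmed, non-empty values."""
    values = [value.strip() for value in raw.split(",") if value.strip()]
    # Assumption: duplicate options are a mistake, so reject them early.
    if len(set(values)) != len(values):
        raise ValueError("Possible Values contains duplicates")
    return values


# The four sentiment categories used in this project.
SENTIMENT_VALUES = parse_possible_values("Positive, Negative, Neutral, Not Sure")


def is_valid_label(label, allowed=SENTIMENT_VALUES):
    """Return True if the label is one of the allowed dropdown values."""
    return label in allowed


print(SENTIMENT_VALUES)
print(is_valid_label("Neutral"), is_valid_label("Happy"))
```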

Creating Quality Workflows

The Quality Workflow section is where you specify how output quality checks are carried out and between which execution levels. In this example, we create the following execution levels, where Level 1 is the lowest level and Level 3 is the highest.

  • Twitter Sentiment Labeler: The labeling analyst representing execution Level 1.

  • Twitter Sentiment QA: The quality analyst representing execution Level 2.

  • Twitter Sentiment Project Lead: The labeling project lead representing execution Level 3.

  1. Here, we will opt for the Majority Vote workflow. To select this option, click the Execution Method field and select Majority Vote from the drop-down list.

  2. By default, Taskmonk creates the Analyst role for you. You can either delete this role name and click the Add Execution Level icon, or retain this role name, and edit it by clicking the Update Execution Level icon.

  3. In the Add Execution Level modal window, add the field values shown in the image below:

  4. Enter Twitter Sentiment Labeler as the Execution Level Name. By default, Level 1 is assigned to labelers; Skip Task can be set to Enabled or Disabled, and Reject Task is automatically Disabled. You can leave these fields unchanged if required.

  5. Click Add.

  6. Taskmonk adds the new execution level, and updates the Quality Workflow section. Repeat the above steps to add Twitter Sentiment QA at Level 2 and Twitter Sentiment Project Lead at Level 3.

  7. Click Next to move to the next step, Execution.

  8. To add the User counts, Majority vote, and Consensus Percentage values, follow these steps:

    1. Set User counts to 3 so that every task is allocated to 3 different users, and set Majority vote to 2. With these values, a task completes at level 1 if at least 2 of the 3 users provide the same response. If all 3 users provide different responses, the task moves to level 2.

    2. Set Consensus Percentage to the percentage of the total imported tasks that should be allocated to multiple users. The remaining tasks follow the Maker-Checker or Maker-Editor model.
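The majority rule described above can be sketched in a few lines. This is an illustrative sketch, not Taskmonk's implementation: the function name `resolve_majority` is hypothetical, and it assumes the Majority vote value is more than half of User counts (as with 2 of 3), so two different labels can never both reach the threshold.

```python
from collections import Counter


def resolve_majority(responses, majority=2):
    """Return the agreed label if at least `majority` responses match, else None.

    A result of None stands for a task without a majority, which would
    move on to level 2.
    """
    if not responses:
        return None
    label, count = Counter(responses).most_common(1)[0]
    return label if count >= majority else None


# Two of three users agree: the task completes at level 1.
print(resolve_majority(["Positive", "Positive", "Neutral"]))

# All three users disagree: no majority, so the task would move to level 2.
print(resolve_majority(["Positive", "Negative", "Neutral"]))
```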

Creating the Process Logic

The Process Logic section lets you configure the logic based on which datasets are moved from one execution level to the next. For the current project, we enable the following rules:

  • In case of Majority Vote, do not create any rule from level 1 to level 2. All the tasks which do not have a majority output will automatically flow to level 2.

  • 40% of the project datasets move from level 2 to level 3.
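To make the percentage rule concrete, the sketch below picks 40% of a set of level 2 tasks to pass on to level 3. This page does not say how Taskmonk chooses which tasks fall into the percentage, so uniform random sampling is an assumption here, as are the function name `sample_for_next_level` and the task IDs.

```python
import random


def sample_for_next_level(task_ids, percentage, seed=None):
    """Pick roughly `percentage` percent of the tasks, uniformly at random.

    Assumption: selection is a uniform random sample; the real selection
    logic is not documented on this page.
    """
    tasks = list(task_ids)
    rng = random.Random(seed)  # a seed makes the selection repeatable
    sample_size = round(len(tasks) * percentage / 100)
    return sorted(rng.sample(tasks, sample_size))


# Hypothetical IDs of ten tasks completed at level 2; four of them are picked.
level2_tasks = range(1, 11)
print(sample_for_next_level(level2_tasks, 40, seed=7))
```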

To set the Process Logic, follow these steps:

  1. In the Process Logic section, click the +ADD PROCESS LOGIC button to create the logical flow based on which datasets are moved from one execution level to the next.

  2. In the ADD PROCESS LOGIC section, select L2 [TWITTER SENTIMENT QA] from the dropdown list for From Level, and L3 [TWITTER SENTIMENT PROJECT LEAD] for To Level.

  3. Click the Add Rule icon to load the add rule UI.

  4. Click the USE THE ADD BUTTON TO START ADDING RULES icon.

  5. Click the + button for LEVEL1.

  6. Click the Rule Type drop-down and select Percentage Rule.

  7. Click the Percentage drop-down and select 40 %.

  8. Select Project for Percentage Scope and click Submit.

  9. Repeat the above steps to add any other rules by selecting the required options for Percentage Scope and Combinator.

  10. Click Next to move on to the next step, which is managing users and roles.

Managing Users and Roles

You must now add users to your project and assign the execution levels you just created to them.

  1. Click the Users tab just above Quality Workflow to view the Manage Users section.

  2. Click the Add button on the top-right.

  3. In the Add User window, select the users from the list for the various levels (L1, L2, and L3). The number of users in level 1 must be at least the User counts value set in the workflow.

  4. Click Add to add the selected users. The users are added to the Manage Users page.

Managing Project Datasets

Your project is now configured. Congratulations!

Before you can start labeling, you must upload the input files containing the raw data to be labeled.

  1. Click the Datasets tab. The Datasets page appears. Use this page to manage datasets for your project.

  2. Taskmonk organizes datasets into batches to simplify management and tracking. To add a new dataset, click Add Batch on the right side of the page. The Add Batch modal appears.

  3. In the Add New Batch field, enter Batch 1 as the name of the batch that you want to import. You can ignore the other fields.

     

     

  4. Click Submit. This creates a new batch of data for your project and adds it to the Pending tab of the Datasets page.

  5. You can now upload datasets into the batch, as required. To add a dataset to the batch, click the Import button under the Tasks(Import/Export) column. The Import Task modal appears.

  6. Click Choose Files, select the sample input file from your computer, and click Import.

  7. Once the dataset is imported, click Close to exit the modal. All the tasks appear under the Pending section.

Labeling Text Using Taskmonk

Your project is now ready.

  1. Log in as Analyst and click the My Tasks icon at the top of the page. The My Tasks page appears.

  2. Click the Get Tasks button adjacent to the Twitter Post Analysis project. The labeling UI associated with this project appears.

  3. You can see the following project details in the labeling UI:

  • Batch Name (Batch 1) in the top-left section of the page.

  • Input fields (Text) on the top half of the page.

  • Output fields on the bottom half of the page.

    For detailed information on working with a typical labeling UI, see Labeling Data.

Viewing Batch Status Report

  1. Sign in as Manager, go to the Projects page and click Reports > View for the Twitter Post Analysis project. The Reports page appears.

  2. Click Dataset Progress Reports to view the batch status report. This shows the total number of tasks pending and completed at each level for all batches. The Disagreement score indicates the total number of tasks without a majority at level 1, that is, the total number of tasks that moved to level 2.
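As a rough illustration of what the Disagreement score counts, the sketch below tallies the tasks whose level 1 responses do not reach the majority threshold. The function name `disagreement_score`, the task IDs, and the responses are hypothetical; the report itself is computed by Taskmonk, and this is only a sketch of the definition given above, assuming a Majority vote value of 2.

```python
from collections import Counter


def disagreement_score(task_responses, majority=2):
    """Count tasks whose level 1 responses do not reach the majority threshold."""
    score = 0
    for responses in task_responses.values():
        # Size of the largest group of identical responses for this task.
        top_count = Counter(responses).most_common(1)[0][1] if responses else 0
        if top_count < majority:
            score += 1
    return score


# Hypothetical level 1 responses for three tasks, keyed by task ID.
batch_responses = {
    1: ["Positive", "Positive", "Neutral"],   # majority reached
    2: ["Positive", "Negative", "Neutral"],   # no majority
    3: ["Negative", "Negative", "Negative"],  # majority reached
}
print(disagreement_score(batch_responses))
```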

This completes our example for Twitter Sentiment Identification using Majority Vote. Thank you!
