Can Labeling be at the center of an AI development lifecycle strategy?
Well, my team of students is building out an object identification project using a labeling-centered approach for a Computer Vision (CV) pipeline. It is a four-week, fully online project with a team of global, multi-disciplinary professionals, run as a mini LiveLab.
We are using a set of images from residential food deliveries to label bags and build out an object identification model, but the focus is on labeling: using it to solve a business problem and to eliminate bias in the labeling stage of a CV pipeline.
Here's a quick summary of what to expect in this project.
This project is about giving you the multi-disciplinary team experience of industry that we practice in our LiveLab course: building out an AI model with labeling as a business function, using a scoped object identification project.
Goals: Students will get
1. An intro to a Computer Vision (CV) pipeline
2. Experience of a quick data science sprint (the AI lifecycle)
3. Confidence in data and labeling as a business function.
Here's the week-by-week plan and time expectations
Week 1: Kickoff & Business Problem
Identify stakeholders, arrive at a business problem, and short-list five possible business use cases for the provided dataset of food-delivery drop-off images.
Get to know your teammates and identify your role in the team.
Get access to the Playment labeling platform and look at the data. Find a business problem that this data cannot answer.
Week 2: Problem Statement
Take the business problem and arrive at a problem statement (Sudha will provide lesson material for this). Do this as a team brainstorm.
Look at the data to ensure that you can solve the business problem you have shortlisted. You may have to do feature engineering to fill gaps with more data; since this is a Computer Vision (CV) project, this means feature extraction (Sudha will provide a lesson and pointers).
On the labeling platform, collect ground truth data (correctly labeled data) to use for validating the labeled/annotated dataset. Also set quality evaluation metrics for the labeled data.
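A common quality evaluation metric for bounding-box annotations is Intersection over Union (IoU) against the ground truth boxes. Here is a minimal sketch of the idea; the (x1, y1, x2, y2) box format and the example coordinates are assumptions for illustration, not part of the project spec:

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes in (x1, y1, x2, y2) format."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)  # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Hypothetical example: an annotator's box vs. the ground-truth box for the same bag
annotated = (50, 50, 150, 150)
ground_truth = (60, 60, 160, 160)
score = iou(annotated, ground_truth)
print(f"IoU = {score:.3f}")
```

An annotation could then be accepted or sent back for re-labeling based on an IoU threshold the team agrees on.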
Week 3: Bias and Ethics
Identify bias in the data and any unethical use cases you want to avoid. Think of bias as leaving some stakeholder or segment of users behind. Check your labeled data for labeling bias.
Think about bias mitigation strategies.
Identify your labeling QC (quality check) process (Sudha will provide guidance).
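One simple way to check your labeled data for bias is to look at how labels are distributed across segments of the data. This is a minimal sketch; the segment names ("daytime"/"night") and counts are hypothetical, and a real check would use the actual metadata on your images:

```python
from collections import Counter

# Hypothetical labeled records: (label, segment) pairs
labels = [
    ("bag", "daytime"), ("bag", "daytime"), ("bag", "daytime"),
    ("bag", "daytime"), ("bag", "night"), ("no_bag", "daytime"),
]

# Count how many labeled examples fall in each segment
by_segment = Counter(seg for _, seg in labels)
total = sum(by_segment.values())
for segment, count in by_segment.items():
    share = count / total
    flag = "  <-- under-represented?" if share < 0.25 else ""
    print(f"{segment}: {count} labels ({share:.0%}){flag}")
```

If a segment (say, night-time drop-offs) is badly under-represented, the model will likely perform worse for those users, which is exactly the kind of "leaving a segment behind" bias to mitigate.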
Week 4: Build Model and QC
The data scientist can build a basic model and use it to test both labeling quality and how well the solution addresses the business problem.
Document/demo at WeeklyWed (a date in late August, to be chosen by the team).
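To test how well a basic model performs against the ground truth set, precision and recall are a good starting point. The sketch below is deliberately simplified (exact per-image label comparison with hypothetical data); a real CV evaluation would match predicted boxes to ground-truth boxes by IoU first:

```python
def precision_recall(predictions, ground_truth, positive="bag"):
    """Precision and recall for per-image labels (simplified exact-match version)."""
    tp = sum(1 for p, g in zip(predictions, ground_truth) if p == positive and g == positive)
    fp = sum(1 for p, g in zip(predictions, ground_truth) if p == positive and g != positive)
    fn = sum(1 for p, g in zip(predictions, ground_truth) if p != positive and g == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted bags, how many were real
    recall = tp / (tp + fn) if tp + fn else 0.0     # of real bags, how many were found
    return precision, recall

# Hypothetical results on a small validation set
preds = ["bag", "bag", "no_bag", "bag", "no_bag"]
truth = ["bag", "no_bag", "no_bag", "bag", "bag"]
p, r = precision_recall(preds, truth)
print(f"precision={p:.2f}, recall={r:.2f}")
```

Low recall with clean ground truth often points back at the labels, so this one check exercises both halves of the Week 4 goal.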
Time Required:
Weekly: 1 hour for lessons and updates to the team and Sudha.
Weekly: 1 hour of lab work (labeling, feature extraction, QC checks); this is asynchronous, so coordinate with your team for live sync-ups. Expect an extra hour each week to learn the topic lessons, do your part, and communicate with your team.
Final demo and documentation: 2 hours total.
The data science role may invest more time to test out the model at every stage and try different tools.
Deliverables:
CV Pipeline Plan (document/deck)
QC metrics and quality check results
Problem statement for the business problem/use case
A pilot CV model (tentatively planned to be built with YOLOv5, with the data scientist having discretion to decide the relevant tools/models)
Good Luck Students! I am excited to see what you come up with.
Remember, I am here to help.
Remember to subscribe to keep this for future reference, and share with your friends if they would benefit from it.