NLP Summarization: Synthetic Text Message Collection for Machine Learning
A leading technology company wanted to improve its machine learning models' ability to recognize and replicate conversational text messages. To do so, it needed high-volume, high-quality synthetic text message datasets in Arabic, Mandarin, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish.
How we helped:
Pactera EDGE delivered over 1M high-quality synthetic text messages in 11 languages spanning the Americas, Europe, and Asia, using its advanced AI and Machine Learning solution. Our team worked with the PowerApps and PowerBI teams to enable NLP that helps citizen developers build their products.
- Pactera EDGE tapped into its global partners and its crowd resource pool of roughly 600,000 contributors to collect and deliver synthetic, crowd-written text messages.
- A high volume of synthetic text messages arrived in a very short period, so the team had to adjust quickly to achieve and maintain a consistent level of data diversity and quality.
- Pactera EDGE engaged QA experts in each language to ensure high-quality deliverables.
- Our team evaluated each AI-generated rephrase against the actual user query (1 if the rephrase matched the user's intent, 0 if it did not) and provided a suggested rephrase to train the AI.
- Our team delivered AI engine optimization insights drawn from what was learned on the project.
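
The binary intent-match labeling described above can be illustrated with a short sketch. This is not the project's actual tooling: the record layout, the field names (`user_query`, `ai_rephrase`, `intent_match`, `suggested_rephrase`), the helper functions, and the sample rows are all hypothetical, chosen only to show how 1/0 judgments and suggested rephrases might be summarized and turned into correction pairs for training.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RephraseJudgment:
    """One reviewer judgment (hypothetical record layout)."""
    user_query: str                           # what the user actually asked
    ai_rephrase: str                          # the AI-generated rephrase under review
    intent_match: int                         # 1 = matches the user's intent, 0 = does not
    suggested_rephrase: Optional[str] = None  # reviewer's correction, if any


def match_rate(judgments: List[RephraseJudgment]) -> float:
    """Share of rephrases judged to match the user's intent."""
    if not judgments:
        return 0.0
    return sum(j.intent_match for j in judgments) / len(judgments)


def correction_pairs(judgments: List[RephraseJudgment]) -> List[Tuple[str, str]]:
    """(query, suggested rephrase) pairs for every non-matching judgment."""
    return [
        (j.user_query, j.suggested_rephrase)
        for j in judgments
        if j.intent_match == 0 and j.suggested_rephrase
    ]


# Illustrative sample data, not taken from the project.
sample = [
    RephraseJudgment("show sales by region", "display regional sales", 1),
    RephraseJudgment("show sales by region", "show regions", 0,
                     "list sales grouped by region"),
]

print(match_rate(sample))
print(correction_pairs(sample))
```

In a setup like this, the match rate gives a simple quality signal per language or batch, while the correction pairs are the part that could feed back into training.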