Submission Number: 191
Submission ID: 4361
Submission UUID: 075cb4fc-4a23-4388-a6d4-668a2d2f65e5
Submission URI: /form/project

Created: Mon, 02/12/2024 - 13:20
Completed: Mon, 02/12/2024 - 13:34
Changed: Wed, 09/04/2024 - 15:34

Remote IP address: 128.6.36.2
Submitted by: Udi Zelzion
Language: English

Is draft: No
Webform: Project
Project Title: Framework for Reduction of Ambiguity in Text Data from Generative AI
Program:
CAREERS (323)

Project Image: {Empty}
Tags:
ai (271), natural-language-processing (274), python (69)

Status: In Progress
Project Leader
--------------
Project Leader:
Jim Samuel

Email: jim.samuel@rutgers.edu
Mobile Phone: {Empty}
Work Phone: {Empty}

Project Personnel
-----------------
Mentor(s):
{Empty}

Student-facilitator(s):
Kushal Gunavantkumar Patel (27862)

Mentee(s):
{Empty}


Project Information
-------------------
Project Description:
Generative AI has invaded our places of work and learning with the promise of increasing productivity.
However, many generative AI systems are built on Large Language Models (LLMs), which act as next-word
predictors based on probabilistic modeling. This leads to numerous challenges, especially ambiguity.
This proposal addresses the research question: How can we reduce ambiguity in AI-generated text?
The proposal seeks to 1) find ways to algorithmically identify and flag ambiguity, 2) explore how to
distinguish levels of ambiguity, and 3) explore ways in which ambiguity could be reduced or managed.
Once ambiguity is identified, we intend to use an LLM application to generate improved alternatives. This
project will help improve the quality of human interactions with AI applications such as chatbots.
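
One way to picture goals 1) and 2) is to treat the spread of a next-word probability distribution as a rough ambiguity signal. The sketch below is illustrative only and is not the project's method: it assumes the next-word probabilities are already available from some model, and the function names (shannon_entropy, ambiguity_level) and the thresholds (0.5 and 1.5 bits) are hypothetical choices made for this example.

```python
import math


def shannon_entropy(probs):
    """Return the Shannon entropy (in bits) of a discrete distribution.

    The input is normalized first, so raw scores that do not sum to 1
    are accepted as long as they are non-negative and not all zero.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probs:
        if p > 0:
            q = p / total
            entropy -= q * math.log2(q)
    return entropy


def ambiguity_level(probs, low=0.5, high=1.5):
    """Map entropy to a coarse label.

    The thresholds are hypothetical placeholders, not values from the
    project; they would need tuning on real data.
    """
    h = shannon_entropy(probs)
    if h < low:
        return "low"
    if h < high:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Assumed next-word distributions, invented for illustration.
    confident = [0.97, 0.01, 0.01, 0.01]
    uncertain = [0.25, 0.25, 0.25, 0.25]
    print(ambiguity_level(confident))
    print(ambiguity_level(uncertain))
```

A flat distribution over many candidate words yields high entropy and would be flagged, while a distribution dominated by one word would not. Whether token-level entropy tracks the kind of ambiguity a human reader perceives is an open question that this sketch does not answer.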

Project Information Subsection
------------------------------
Project Deliverables:
{Empty}

Student Research Computing Facilitator Profile:
{Empty}

Mentee Research Computing Profile:
{Empty}

Student Facilitator Programming Skill Level: Practical applications
Mentee Programming Skill Level: {Empty}
Project Institution: {Empty}
Project Address:
{Empty}

Anchor Institution: CR-Rutgers
Preferred Start Date: {Empty}
Start as soon as possible.: Yes
Project Urgency: Already behind; Start date is flexible
Expected Project Duration (in months): 6
Launch Presentation: {Empty}
Launch Presentation Date: {Empty}
Wrap Presentation: {Empty}
Wrap Presentation Date: {Empty}
Project Milestones:
- Milestone Title: Launch 
  Milestone Description: Give a launch presentation during the monthly meeting, get HPC access, and explore and validate multiple LLMs on ready-to-use datasets
  Completion Date Goal: 2024-04-19
  Actual Completion Date: 2024-05-31
- Milestone Title: Identify ambiguity 
  Milestone Description: Narrow down the most promising approaches to identifying ambiguity and run tests
  Completion Date Goal: 2024-06-01
  Actual Completion Date: 2024-07-15
- Milestone Title: Finetuning 
  Milestone Description: Apply fine-tuning and other customizations to the LLMs to generate suitable text
  Completion Date Goal: 2024-07-16
  Actual Completion Date: 2024-08-31
- Milestone Title: Finalize Documentation 
  Milestone Description: Produce a workflow, write a white paper, prepare a presentation, and package any other deliverables.
  Completion Date Goal: 2024-09-01
  Actual Completion Date: 2024-10-01
- Milestone Title: Wrap presentation 
  Milestone Description: Give a wrap presentation at the monthly meeting and have an exit interview.

Github Contributions: {Empty}
Planned Portal Contributions (if any):
{Empty}

Planned Publications (if any):
{Empty}

What will the student learn?:
The student will gain familiarity with Amarel, Rutgers' HPC system, and learn how to run NLP analyses on it.

What will the mentee learn?:
{Empty}

What will the Cyberteam program learn from this project?:
The program will gain Jupyter notebooks with examples of how to run NLP analysis.

HPC resources needed to complete this project?:
Access to the Amarel cluster, Rutgers' HPC system.

Notes:
{Empty}



Final Report
------------
What is the impact on the development of the principal discipline(s) of the project?:
{Empty}

What is the impact on other disciplines?:
{Empty}

Is there an impact on physical resources that form infrastructure?:
{Empty}

Is there an impact on the development of human resources for research computing?:
{Empty}

Is there an impact on institutional resources that form infrastructure?:
{Empty}

Is there an impact on information resources that form infrastructure?:
{Empty}

Is there an impact on technology transfer?:
{Empty}

Is there an impact on society beyond science and technology?:
{Empty}

Lessons Learned:
{Empty}

Overall results:
{Empty}