Submission Number: 193
Submission ID: 4444
Submission UUID: b608a896-cd66-436a-84f2-8cc176faef6d
Submission URI: /form/project

Created: Fri, 03/22/2024 - 11:56
Completed: Fri, 03/22/2024 - 11:56
Changed: Wed, 05/29/2024 - 15:36

Remote IP address: 71.58.230.184
Submitted by: Carrie Brown
Language: English

Is draft: No
Webform: Project
Project Title: Gender Gap in Bankruptcy Filings
Program:
CAREERS (323)

Project Image: {Empty}
Tags:
{Empty}

Status: In Progress
Project Leader
--------------
Project Leader:
Nonna Sorokina

Email: sorokina@psu.edu
Mobile Phone: {Empty}
Work Phone: {Empty}

Project Personnel
-----------------
Mentor(s):
{Empty}

Student-facilitator(s):
Mark Fahim (28393)

Mentee(s):
{Empty}


Project Information
-------------------
Project Description:
Use PACER datasets to collect bankruptcy filings and identify cases filed for business reasons. Then design a web scraper to collect petitioner names and other information from the bankruptcy petitions, and develop a text-analysis routine that analyzes the names and classifies the filings by gender.
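
The name-classification step described above could be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual routine: the lookup table NAME_GENDER, the helper names, and the sample petitioner strings are all hypothetical, and a real run would load a much larger reference list of first names.

```python
import re

# Hypothetical lookup table (assumption); the real project would use a
# far larger reference list of first names.
NAME_GENDER = {"mary": "F", "linda": "F", "james": "M", "robert": "M"}


def first_name(petitioner: str) -> str:
    """Return a lowercase first name from 'First Last' or 'Last, First' text."""
    text = petitioner.strip()
    if "," in text:
        # 'Last, First Middle' form: the given names follow the comma.
        text = text.split(",", 1)[1]
    tokens = re.findall(r"[A-Za-z]+", text)
    return tokens[0].lower() if tokens else ""


def classify_gender(petitioner: str) -> str:
    """Map a petitioner name to 'F', 'M', or 'unknown' via the lookup table."""
    return NAME_GENDER.get(first_name(petitioner), "unknown")


if __name__ == "__main__":
    for name in ["Smith, Mary A.", "James T. Kirk", "Acme Holdings LLC"]:
        print(name, "->", classify_gender(name))
```

Names that are missing from the table, including business entity names, fall through to "unknown" rather than being forced into a category, which keeps ambiguous filings visible for manual review.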

Project Information Subsection
------------------------------
Project Deliverables:
{Empty}

Student Research Computing Facilitator Profile:
{Empty}

Mentee Research Computing Profile:
{Empty}

Student Facilitator Programming Skill Level: {Empty}
Mentee Programming Skill Level: {Empty}
Project Institution: Penn State University
Project Address:
{Empty}

Anchor Institution: CR-Penn State
Preferred Start Date: {Empty}
Start as soon as possible.: No
Project Urgency: Already behind, Start date is flexible
Expected Project Duration (in months): {Empty}
Launch Presentation: {Empty}
Launch Presentation Date: 05/17/2024
Wrap Presentation: {Empty}
Wrap Presentation Date: {Empty}
Project Milestones:
- Milestone Title: Collect Bankruptcy Data
  Milestone Description: Write code to obtain bankruptcy case filings from 2008 to the present from the Federal Judicial Center website. Organize the available data elements in a SAS dataset. Produce summary statistics of the data where applicable. Codify textual categorical responses where possible for further data analysis. Validate the data and ensure proper formatting.
  Completion Date Goal: 2024-05-10
- Milestone Title: Develop NLP code to classify petitions by gender
  Milestone Description: Obtain the list of names from the dataset produced in the first milestone. Develop code to scrape the web for gender identifiers associated with the names. Use NLP logic to associate names with gender. Assign a gender identifier to each bankruptcy filing based on the petitioner names.
  Completion Date Goal: 2024-06-10
- Milestone Title: Assist with statistical analysis 
  Milestone Description: Add local economic data and run regressions in SAS to explore the gender-based distribution of bankruptcy filings.
  Completion Date Goal: 2024-07-10
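
The "codify textual categorical responses" and "summary statistics" steps in the first milestone could look something like the sketch below. It is written in Python for illustration only (the milestones name SAS as the working environment), and the records, the field names, and the code values in DEBT_CODES are hypothetical rather than the actual dataset schema.

```python
from collections import Counter

# Hypothetical records (assumption); field names and values are
# illustrative and do not reflect the real dataset layout.
filings = [
    {"case_id": "A1", "chapter": "7", "nature_of_debt": "Business"},
    {"case_id": "A2", "chapter": "13", "nature_of_debt": "consumer "},
    {"case_id": "A3", "chapter": "11", "nature_of_debt": "BUSINESS"},
    {"case_id": "A4", "chapter": "7", "nature_of_debt": ""},
]

# Hypothetical numeric codes for the textual categories; 0 marks
# missing or unrecognized responses.
DEBT_CODES = {"business": 1, "consumer": 2}
MISSING = 0


def codify(value: str, codes: dict, missing: int = MISSING) -> int:
    """Normalize a textual response and map it to its numeric code."""
    return codes.get(value.strip().lower(), missing)


for row in filings:
    row["debt_code"] = codify(row["nature_of_debt"], DEBT_CODES)

# Simple summary statistics: the count and share of filings per code.
counts = Counter(row["debt_code"] for row in filings)
shares = {code: n / len(filings) for code, n in counts.items()}

if __name__ == "__main__":
    for code in sorted(counts):
        print(code, counts[code], f"{shares[code]:.2f}")
```

Normalizing case and whitespace before the lookup means inconsistent spellings of the same response collapse into one code, and the explicit missing code makes validation gaps easy to count.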

Github Contributions: {Empty}
Planned Portal Contributions (if any):
{Empty}

Planned Publications (if any):
{Empty}

What will the student learn?:
{Empty}

What will the mentee learn?:
{Empty}

What will the Cyberteam program learn from this project?:
{Empty}

HPC resources needed to complete this project?:
{Empty}

Notes:
{Empty}



Final Report
------------
What is the impact on the development of the principal discipline(s) of the project?:
{Empty}

What is the impact on other disciplines?:
{Empty}

Is there an impact on physical resources that form infrastructure?:
{Empty}

Is there an impact on the development of human resources for research computing?:
{Empty}

Is there an impact on institutional resources that form infrastructure?:
{Empty}

Is there an impact on information resources that form infrastructure?:
{Empty}

Is there an impact on technology transfer?:
{Empty}

Is there an impact on society beyond science and technology?:
{Empty}

Lessons Learned:
{Empty}

Overall results:
{Empty}