Submission information
Submission Number: 193
Submission ID: 4444
Submission UUID: b608a896-cd66-436a-84f2-8cc176faef6d
Submission URI: /form/project
Created: Fri, 03/22/2024 - 11:56
Completed: Fri, 03/22/2024 - 11:56
Changed: Wed, 05/29/2024 - 15:36
Remote IP address: 71.58.230.184
Submitted by: Carrie Brown
Language: English
Is draft: No
Webform: Project
Project Title: Gender GAP in bankruptcy filings
Program: CAREERS (323)
Project Image: {Empty}
Tags: {Empty}
Status: In Progress

Project Leader
--------------
Project Leader: Nonna Sorokina
Email: sorokina@psu.edu
Mobile Phone: {Empty}
Work Phone: {Empty}

Project Personnel
-----------------
Mentor(s): {Empty}
Student-facilitator(s): Mark Fahim (28393)
Mentee(s): {Empty}

Project Information
-------------------
Project Description: Use PACER datasets to collect bankruptcy filings and identify cases filed for business reasons. Then design a web scraper to collect petitioner names and other information from the bankruptcy petitions, and develop a text-analysis routine that analyzes the names and classifies bankruptcy filings by gender.

Project Information Subsection
------------------------------
Project Deliverables: {Empty}
Student Research Computing Facilitator Profile: {Empty}
Mentee Research Computing Profile: {Empty}
Student Facilitator Programming Skill Level: {Empty}
Mentee Programming Skill Level: {Empty}
Project Institution: Penn State University
Project Address: {Empty}
Anchor Institution: CR-Penn State
Preferred Start Date: {Empty}
Start as soon as possible.: No
Project Urgency: Already behind3Start date is flexible
Expected Project Duration (in months): {Empty}
Launch Presentation: {Empty}
Launch Presentation Date: 05/17/2024
Wrap Presentation: {Empty}
Wrap Presentation Date: {Empty}
Project Milestones:
- Milestone Title: Collect Bankruptcy Data
  Milestone Description: Write the code to obtain bankruptcy case filings from 2008 to the present from the Federal Judicial Center website. Organize the available data elements in a SAS dataset. Produce summary statistics of the data where applicable. Codify textual categorical responses where possible for further data analysis. Validate the data and ensure proper formatting.
  Completion Date Goal: 2024-05-10
- Milestone Title: Develop NLP code to classify petitions by gender
  Milestone Description: Obtain the list of names from the dataset produced in the first milestone. Develop the code to scrape the web for gender identifiers associated with the names. Use NLP logic to associate names with gender. Assign a gender identifier to each bankruptcy filing based on the names.
  Completion Date Goal: 2024-06-10
- Milestone Title: Assist with statistical analysis
  Milestone Description: Add local economic data and run regressions in SAS to explore the gender-based distribution of bankruptcy filings.
  Completion Date Goal: 2024-07-10
Github Contributions: {Empty}
Planned Portal Contributions (if any): {Empty}
Planned Publications (if any): {Empty}
What will the student learn?: {Empty}
What will the mentee learn?: {Empty}
What will the Cyberteam program learn from this project?: {Empty}
HPC resources needed to complete this project?: {Empty}
Notes: {Empty}

Final Report
------------
What is the impact on the development of the principal discipline(s) of the project?: {Empty}
What is the impact on other disciplines?: {Empty}
Is there an impact on physical resources that form infrastructure?: {Empty}
Is there an impact on the development of human resources for research computing?: {Empty}
Is there an impact on institutional resources that form infrastructure?: {Empty}
Is there an impact on information resources that form infrastructure?: {Empty}
Is there an impact on technology transfer?: {Empty}
Is there an impact on society beyond science and technology?: {Empty}
Lessons Learned: {Empty}
Overall results: {Empty}