
vla6/Blog_naics_nn

Blog_naics_nn

Work for Towards Data Science

Demonstrating random injection of "unseen" encoding values during neural network training using a custom data generator.
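As a rough illustration of the idea (not the repository's actual generator), the sketch below replaces a random fraction of categorical codes with a reserved "unseen" index each time a batch is drawn. The names `UNSEEN_CODE`, `inject_unseen`, and `batch_generator`, the choice of index 0 for the unseen value, and the `((codes, features), labels)` batch layout are all assumptions made for this example.

```python
import numpy as np

# Assumption: index 0 of the embedding is reserved for "unseen" categories.
UNSEEN_CODE = 0


def inject_unseen(codes, rate, rng, unseen_code=UNSEEN_CODE):
    """Return a copy of `codes` with a random fraction set to `unseen_code`."""
    codes = np.asarray(codes).copy()
    # Each entry is replaced independently with probability `rate`.
    mask = rng.random(codes.shape) < rate
    codes[mask] = unseen_code
    return codes


def batch_generator(codes, features, labels, batch_size, rate, seed=0):
    """Yield shuffled mini-batches forever, drawing a fresh injection mask per batch."""
    rng = np.random.default_rng(seed)
    n = len(codes)
    while True:
        order = rng.permutation(n)  # reshuffle at the start of every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch_codes = inject_unseen(codes[idx], rate, rng)
            yield (batch_codes, features[idx]), labels[idx]
```

Because the mask is redrawn for every batch, a given row sees its true code in most epochs and the unseen code in a few, which is what lets the embedding for the unseen index receive training signal. Validation and test data would normally be passed through without injection.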

Figure: Line plots showing that model performance for unseen codes increases drastically with stochastic regularization.

Data Disruptions to Elevate Entity Embeddings

Read the article at Medium or TDS

The version of the data for the blog post is saved in the data_disruptions release

Table data is at the top level, in the "tables.xlsx" document.

Code is at the top level; the notebooks should be run in order. Metrics are collected and summarized in 80_perf_summary.ipynb.

Running Code

First, download the SBA Loans Dataset from Kaggle.

Then, edit setup.py:

  • Make input_path point to the SBA Loans dataset on your system
  • temp_path should point to a writeable directory on your system
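The two settings above might look something like the following. Only the variable names `input_path` and `temp_path` come from this README; the path values are placeholders, and `check_paths` is a hypothetical helper added here to catch a misconfigured path before running the notebooks. The real setup.py may contain other settings that should be left unchanged.

```python
import os
from pathlib import Path

# Placeholder locations (assumptions); replace with real paths on your system.
input_path = Path("/path/to/sba-loans-dataset")
temp_path = Path("/path/to/writeable/temp_dir")


def check_paths(input_path, temp_path):
    """Return a list of problems with the configured paths (empty if none)."""
    problems = []
    if not Path(input_path).exists():
        problems.append(f"input_path does not exist: {input_path}")
    if not Path(temp_path).is_dir():
        problems.append(f"temp_path is not a directory: {temp_path}")
    elif not os.access(temp_path, os.W_OK):
        problems.append(f"temp_path is not writeable: {temp_path}")
    return problems
```

Calling `check_paths(input_path, temp_path)` once after editing the file gives a quick list of anything that still needs fixing.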

For more information on hardware requirements and package installation, see: https://github.com/vla6/Blog_gnn_naics?tab=readme-ov-file#blog_gnn_naics

Visualizing Stochastic Regularization for Entity Embeddings

See subfolder "_A_embeddings" and its README.md

About

Investigations of stochastic randomization for entity embeddings in neural network models, including visualizations of embeddings
