Knowledge Base Resources

Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!

Add a Resource

Introduction to Probabilistic Graphical Models

https://ermongroup.github.io/cs228-notes/

This website summarizes the notes of Stanford's introductory course on probabilistic graphical models. It starts from the very basics and concludes by explaining from first principles the variational auto-encoder, an important probabilistic model that is also one of the most influential recent results in deep learning.

ai machine-learning

0 Likes

Type

learning

Level

Master’s in Cybersecurity Degree Essentials

Offers comprehensive information on various master's degree options in cybersecurity, including program details, admission requirements, and career opportunities, helping students make informed decisions about pursuing an advanced degree in cybersecurity.

resources professional-development cybersecurity

0 Likes

Type

website

Level

Understanding LLM Fine-tuning

The Ultimate Guide to LLM Fine Tuning: Best Practices & Tools

With the recent uprising of LLM's many business are looking at way to adopt these LLMs and fine-tuning these models on specfic data sets to ensure accuracy. These models when fine-tuned can be optimal for fulfilling the specific needs of a company. This site explains explicitly when, how, and why models should be trained. It goes over various strategies for LLM fine -tuning.

big-data training

0 Likes

Type

learning

Level

Hour of Ci

Hour of CI

Hour of Cyberinfrastructure (Hour of CI) is a nationwide campaign to introduce undergraduate and graduate students to cyberinfrastructure and geographic information science (GIS).

arcgis gis administering-hpc

0 Likes

Type

learning

Level

ACCESS Events and Training

Events and Training

Listing of upcoming ACCESS related events and training activities.

professional-development training workforce-development

0 Likes

Type

website

Level

AHPCC documentary

Arkansas High Performance Computing Center

This link is a documentary website to use AHPCC.

0 Likes

Type

documentation

Level

How to Build a Great Relationship with a Mentor

How to Build a Great Relationship with a Mentor

Emphasizes benefits of being mentored. Describes how to identify and choose a mentor. Suggests a path forward. Not mentor or two-way focused.

mentorship professional-development workforce-development

0 Likes

Type

website

Level

Performance Engineering Of Software Systems

MIT Performance Engineering Of Software Systems Homepage

A class from MITOpenCourseware that gives a hands on approach to building scalable and high-performance software systems. Topics include performance analysis, algorithmic techniques for high performance, instruction-level optimizations, caching optimizations, parallel programming, and building scalable systems.

optimization parallelization training

0 Likes

Type

learning

Level

Examples of code using JSON nlohmann header only Library for C++

This code showcases how to work with the header-only nlohmann JSON library for C++. In order to compile, change the extensions from json_test.txt to json_test.cpp and test.txt to test.json. You must also download the header files from https://github.com/nlohmann/json. Complilation instructions are at the bottom of json_test. This code is very helpful for creating config files, for example.

c++

0 Likes

Type

learning

Level

Displaying Scientific Data with Tableau

Displaying Scientific Data with Tableau

Tableau is a popular and capable software product for creating charts that present data and dashboards that allow you to explore data. It is typically used to present business or statistical data, but can also create compelling visualizations of scientific data. However, scientific data is often generated or stored in formats that are not immediately accessible by Tableau. This seminar will explore the data formats that work best with Tableau and the available mechanisms for generating scientific data in (or converting it to) those formats so that you can apply the full power of Tableau to create the best possible visualizations of your data.

big-data data-analysis training workforce-development

0 Likes

Type

video_link

Level

Machine Learning in Astrophysics

Machine learning is becoming increasingly important in field with large data such as astrophysics. AstroML is a Python module for machine learning and data mining built on numpy, scipy, scikit-learn, matplotlib, and astropy allowing for a range of statistical and machine learning routines to analyze astronomical data in Python. In particular, it has loaders for many open astronomical datasets with examples on how to visualize such complicated and large datasets.

plotting big-data image-processing machine-learning astrophysics

0 Likes

Type

documentation

Level

CI Computing Module For All

Computing Module: Introduces fundamental concepts and skills of Cyberinfrastructure (CI) and High-Performance Computing (HPC) to lower the barrier to becoming CI users in disaster management research. The module will cover the critical topics of CI and HPC with hands-on sessions. Disaster Data Module: Introduces concepts of geospatial big data in disaster management. Students will learn how to access and process disaster data. Geospatial Analytic Module: Introduces geospatial analytics skills to address real-world challenges in disaster management. The module will use the data introduced in the Disaster Data Module and cover various geospatial analytics topics such as geosimulation, spatial optimization, network analysis, terrain analysis, Geospatial Artificial Intelligence (GeoAI), social sensing, and CyberGIS.

0 Likes

Type

learning

Level

AI/ML TechLab - Accelerating AI/ML Workflows on a Composable Cyberinfrastructure

This technology lab contains a set of sessions to help a new user start an AI project on the ACES cluster, a composable accelerator testbed at Texas A&M University. You will learn how to create and activate a virtual environment, manipulate and visualize data with Pandas and Matplotlib, use Scikit-learn for linear regression and classification applications, and use Pytorch to create and train a simple image classification model with deep neural networks (DNN).

ACES documentation TAMU ai visualization deep-learning machine-learning neural-networks login authentication composable-systems gpu nvidia slurm bash modules vim anaconda conda programming python scikit-learn

0 Likes

Type

documentation

Level

Open Storage Network

Open Storage Network

The Open Storage Network, a national resource available through the XSEDE resource allocation system, is high quality, sustainable, distributed storage cloud for the research community.

open-storage-network data-management data-retention storage hpc-storage

0 Likes

Type

website

Level

How the Little Jupyter Notebook Became a Web App: Managing Increasing Complexity with nbdev

Tutorial Site

A tutorial entitled "How the Little Jupyter Notebook Became a Web App: Managing Increasing Complexity with nbdev" presented at SciPy 2023 in Austin, TX. This tutorial is hosted in a series of Jupyter Notebooks which can be accessed in the click of a button using Binder. See the README for more information.

0 Likes

Type

learning

Level

Electric field analyses for molecular simulations

TUPA - Electric field analyses for molecular simulations

Tool to compute electric fields from molecular simulations

visualization computational-chemistry conda python

0 Likes

Type

tool

Level

File management of Visual Studio Code on clusters

VS Code installation

Visual Studio Code, commonly known as VSCode, is a popular tool used by programmers worldwide. It serves as a text editor and an Integrated Development Environment (IDE) that supports a wide variety of programming languages. One of its key features is its extensive library of extensions. These extensions add on to the basic functionalities of VSCode, making coding more efficient and convenient. However, there's a catch. When these extensions are installed and used frequently, they generate a multitude of files. These files are typically stored in a folder named .vscode-extension within your home directory. On a cluster computing facility such as the FASTER and Grace clusters at Texas A&M University, there's a limitation on how many files you can have in your home directory. For instance, the file number limit could be 10000, while the .vscode-extension directory can hold around 4000 temporary files even with just a few extensions. Thus, if the number of files in your home directory surpasses this limit due to VSCode extensions, you might face some issues. This restriction can discourage users from taking full advantage of the extensive features and extensions offered by the VSCode editor. To overcome this, we can shift the .vscode-extension directory to the scratch space. The scratch space is another area in the cluster where you can store files and it usually has a much higher limit on the number of files compared to the home directory. We can perform this shift smoothly using a feature called symbolic links (or symlinks for short). Think of a symlink as a shortcut or a reference that points to another file or directory located somewhere else. Here's a step-by-step guide on how to move the .vscode-extension directory to the scratch space and create a symbolic link to it in your home directory: 1. Copy the .vscode-extension directory to the scratch space: Using the cp command, you can copy the .vscode-extension directory (along with all its contents) to the scratch space. Here's how: cp -r ~/.vscode-extension /scratch/user Don't forget to replace /scratch/user with the actual path to your scratch directory. 2. Remove the original .vscode-extension directory: Once you've confirmed that the directory has been copied successfully to the scratch space, you can remove the original directory from your home space. You can do this using the rm command: rm -r ~/.vscode-extension It's important to make sure that the directory has been copied to the scratch space successfully before deleting the original. 3. Create a symbolic link in the home directory: Lastly, you'll create a symbolic link in your home directory that points to the .vscode-extension directory in the scratch space. You can do this as follows: ln -s /scratch/user/.vscode-extension ~/.vscode-extension By following this process, all the files generated by VSCode extensions will be stored in the scratch space. This prevents your home directory from exceeding its file limit. Now, when you access ~/.vscode-extension, the system will automatically redirect you to the directory in the scratch space, thanks to the symlink. This method ensures that you can use VSCode and its various extensions without worrying about hitting the file limit in your home directory.

faster file-limit scratch file-transfer

0 Likes

Type

learning

Level

Introduction to GPU/Parallel Programming using OpenACC

Intro to OpenACC

Introduction to the basics of OpenACC.

gpu c c++compiling fortran

0 Likes

Type

presentation

Level

GIS: What is a Geodetic Datums?

What are Geodetic Datums?

Often when working with GIS, or spatial data, one encounters the word "datum" and it may require that you choose a "datum" when doing GIS computation tasks. Below is a short video on what are datums from NOAA and UCAR.

arcgis gis

0 Likes

Type

learning

Level

AI for improved HPC research - Cursor and Termius - Powerpoint

Powerpoint - Cursor and Termius benefits for HPC

These slides provide an introduction on how Termius and Cursor, two new and freemium apps that use AI to perform more efficient work, can be used for faster HPC research.

documentation ai machine-learning ssh programming programming-best-practices python terminal-emulation-and-window-management

0 Likes

Type

presentation

Level

Fundamentals of R Programming

This course is an introduction to the R programming language and covers the fundamental concepts needed to operate in the R environment. This course was taught for the ACCESS community on September 26, 2023, but the materials for the course are still available on the ACES cluster and can be completed independently. All materials are presented as learnR notebooks and cover several topics, including data types, variables, built-in functions, data structures, and plotting.

ACES TAMU plotting data-analysis r

0 Likes

Type

learning

Level

Educause HEISC-800-171 Community Group

Educause HEISC-800-171 Community Group

The purpose of this group is to provide a forum to discuss NIST 800-171 compliance. Participants are encouraged to collaborate and share effective practices and resources that help higher education institutions prepare for and comply with the NIST 800-171 standard as it relates to Federal Student Aid (FSA), CMMC, DFARS, NIH, and NSF activities.

cybersecurity

0 Likes

Type

website

Level

Framework to help in scaling Machine Learning/Deep Learning/AI/NLP Models to Web Application level

Framework to help in scaling Machine Learning/Deep Learning/AI/NLP Models to Web Application level

This framework will help in scaling Machine Learning/Deep Learning/Artificial Intelligence/Natural Language Processing Models to Web Application level almost without any time.

ai deep-learning machine-learning neural-networks

0 Likes

Type

learning

Level

Intro to Statistical Computing with Stan

The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function. Here are some useful links to start your exploration of this statistical programming language, and a Python interface to Stan.

data-analysis machine-learning monte-carlo python

0 Likes

Type

documentation

Level

Cybersecurity Guide

Cybersecurity Guide

Cybersecurity Guide is a comprehensive resource for students and early career professionals that provides users with a wide range of resources and up-to-date information on cybersecurity, including cybersecurity degree programs and bootcamps, career guides, as well as online courses and training opportunities. Additionally, it covers trends, best practices, and much more.

resources training data-security cybersecurity

0 Likes

Type

website

Level