The Magic Whiteboard - Part III

Subtitle: The Robot Hacker

The Hacker

The Hacker gives you several programming languages for scientific research as well as tools to analyze data and construct machine-learning methods for datasets. The Hacker is mostly built using JupyterLab notebooks which let you insert notes and comments about the code you write using the same formatting that’s available for the Magic Whiteboard. You can insert LaTeX math directly into your notes.

In principle, you could use JupyterLab to write your document since you can intersperse notes and code. We think that it’s more appropriate to use the Magic Whiteboard for your main document and to reserve The Hacker for coding. The main results should be included in the document, but the coding details should be in an attached appendix. Most journals expect this separation, so having a space for writing and a space for coding seems more natural.

Just like the Magic Whiteboard and the Librarian, the Hacker is equipped with AI assistance, a conversational large language model called Jupyternaut. You will still have access to the AI systems installed for the Magic Whiteboard since The Hacker runs in a tab in Obsidian.

Jupyter and Jupyternaut AI

Mamba, Kernels, Environments, and Projects

When you start to work on a computational project you have to decide which language you want to use. There may be packages or extensions to the language that will be helpful to your project, so you’ll have to load those into your workspace as well. Packages usually depend on many other functions or subroutines, which also need to be installed. Rather than trying to keep track of all the required sub-packages yourself, you can let a package manager do it.

If you search for package managers for Jupyter you’ll quickly find Anaconda, but there’s an alternative called Mamba. Mamba is designed as a drop-in replacement for conda, the package manager behind Anaconda, and is generally much faster at working out the correct dependencies for each package. For each language, you need to install a kernel, which “is a programming language-specific process that executes the code contained in a Jupyter notebook”. A kernel connects Jupyter to a computer language.

An environment contains all of the packages you need for the project you’re working on. You could think of it as an isolated folder on your computer containing everything required for one project. You should create a new environment for each project because the environment records the specific package versions that the project uses. By keeping everything isolated you can share not only the code you wrote but also exactly the software needed to run it, which makes the project much easier to reproduce.
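To see this isolation from inside a notebook, a small sketch like the one below can report which interpreter and environment the current kernel is running in. It uses only the Python standard library. The function name `describe_environment` is made up for illustration, and `CONDA_DEFAULT_ENV` is the variable that conda and Mamba normally set on activation, so it may be missing if the kernel was started some other way.

```python
import os
import platform
import sys


def describe_environment():
    """Collect a few facts that identify the environment this code runs in."""
    return {
        # Version of the Python interpreter executing this cell
        "python_version": platform.python_version(),
        # Full path of the interpreter; it normally sits inside the environment folder
        "executable": sys.executable,
        # Root folder of the environment
        "prefix": sys.prefix,
        # Name of the activated conda/Mamba environment, if the variable is set
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV", "(not set)"),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running the same cell from two different kernels should show two different `prefix` folders, which is a quick way to confirm that each project really has its own environment.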

If you want to try Mamba and already have Anaconda installed, first remove Anaconda and any associated folders. Follow the instructions for installing Mamba using the Miniforge distribution, which will create one environment called base. On Windows, the installer does not add Miniforge to the PATH environment variable and doesn’t add a link to the start menu by default, but you should select both of these options.

The Mamba Troubleshooting page explains that the Anaconda default channels should not be used, and no other packages should be installed in the base environment. Channels are online repositories of downloadable packages. Start the Miniforge prompt, type mamba info, and follow the directions on the Troubleshooting page to remove any Anaconda links containing pkgs/main, pkgs/r/R, msys2, or defaults. If the channel URLs contain only conda-forge you should be fine.
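If you prefer to check the channel list programmatically, the idea can be sketched in a few lines of Python. This is only an illustration: the marker strings are taken from the paragraph above (with `pkgs/r` shortened so that it also matches `pkgs/r/R`), the helper name `flag_anaconda_channels` is hypothetical, and the sample URLs are typical examples rather than output copied from a real `mamba info` run.

```python
# Substrings that indicate an Anaconda default channel (taken from the text above)
ANACONDA_MARKERS = ("pkgs/main", "pkgs/r", "msys2", "defaults")


def flag_anaconda_channels(channel_urls):
    """Return the channel URLs that look like Anaconda default channels."""
    return [
        url
        for url in channel_urls
        if any(marker in url for marker in ANACONDA_MARKERS)
    ]


# Illustrative channel URLs; paste in the ones reported by `mamba info`
sample_channels = [
    "https://conda.anaconda.org/conda-forge/win-64",
    "https://conda.anaconda.org/conda-forge/noarch",
    "https://repo.anaconda.com/pkgs/main/win-64",
]

flagged = flag_anaconda_channels(sample_channels)
if flagged:
    print("Channels to remove:")
    for url in flagged:
        print("  " + url)
else:
    print("Only conda-forge channels found.")
```

An empty result corresponds to the "only conda-forge" situation described above.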

Finally, whatever you might read elsewhere, don’t install anything in the base environment besides Mamba. When using conda or Mamba, install Jupyter in an environment other than base. Doing so does not restrict which environments Jupyter can work with:

Even when installed in its own environment, Jupyter can still run code in your other environments. The Jupyter server starts each kernel as a separate process, so it can use kernels and packages from any environment you have registered with it.

The key is that the ipykernel package must be installed in each Python environment you want to connect to Jupyter, and the kernel must then be registered (for example with python -m ipykernel install --user --name <env name>) so that the Jupyter server can find it. Jupyter itself can remain in a dedicated environment and serve kernels from all your other environments.

To be able to use several different languages in JupyterLab, such as Julia, Octave, Python, and R, you can take either of two approaches:

  1. Install the kernels in the same environment as Jupyter.
  2. Install the kernels in their own language environments.

The second approach of installing kernels in separate environments keeps things more isolated. But the kernels will still work with the Jupyter server as long as you install the kernel integration packages.

So in summary, you can use either approach based on your specific needs and preferences. The key is installing the kernel packages (like ipykernel) and registering with Jupyter through the kernelspecs. Jupyter is flexible enough to work with kernels from any environment.

The advantage of separate environments is isolation, while a shared Jupyter environment reduces duplication of installs. But both methods will enable the use of multiple languages with JupyterLab.
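It can help to see what a kernelspec actually is. For a Python kernel it is a small JSON file named kernel.json that tells Jupyter which interpreter to launch and how. The sketch below builds one as a Python dictionary. The helper name `make_kernelspec`, the environment name `myproject`, and the interpreter path are all placeholders chosen for illustration; in practice the file is written for you when you register a kernel.

```python
import json


def make_kernelspec(python_path, display_name):
    """Build the contents of a kernel.json file for an ipykernel-based kernel."""
    return {
        # Command Jupyter runs to start the kernel; {connection_file} is
        # filled in by Jupyter when the kernel is launched
        "argv": [python_path, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        # Name shown in the JupyterLab launcher
        "display_name": display_name,
        # Language the kernel executes
        "language": "python",
    }


# Hypothetical path to the interpreter inside a separate project environment
spec = make_kernelspec(
    "/home/user/miniforge3/envs/myproject/bin/python",
    "Python (myproject)",
)
print(json.dumps(spec, indent=2))
```

Because the first entry in `argv` points at the interpreter of another environment, the Jupyter server can stay in its own environment and still start kernels elsewhere.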

Constructing the Hacker

To install JupyterLab make a new environment named “Jupyter” or something similar using

mamba create --name "Jupyter" python=<ver no>

where you specify the latest version of Python to be included in the Jupyter environment. Check that the new environment was created with mamba env list, and then activate it with activate Jupyter, which should change the prompt to (Jupyter) C:\Users\<your name>, or something similar if you’re not on a Windows machine. With the environment active and JupyterLab installed in it, start JupyterLab using the command jupyter lab.

To be able to use different computer languages, you need to install kernels for each. The official list of available kernels is here, and there’s an alternate list on GitHub. By default, you’ll have a kernel for Python, and installing others is relatively easy. To install Julia follow the instructions for installing IJulia and run build IJulia in the Julia package manager.

Most kernels are built with IPython, but an alternative is Xeus which gives Octave a much nicer interface in Jupyter, although it isn’t available for Windows systems yet. Xeus is being developed at QuantStack which builds open-source software for science and education.

“Managing Jupyter Kernels in JupyterLab” is a good resource for understanding Jupyter kernels, and “What to do when things go wrong” from Jupyter is also very handy.

From the official list, the ones you might like to start with are Julia, Maxima, Octave, PARI/GP (number theory - not available for Windows), and Wolfram Language. But, there are other useful packages, and some haven’t been included in the list. One add-on that should be installed first is Jupyter AI which provides generative AI to Jupyter notebooks.

The set language SetlX is installed by following the installation instructions and then running the commands shown under Setup for iSetlX in the Mamba window. To begin your introduction to SetlX and the theory of sets, see the accompanying tutorial by Karl Stroetmann and Tom Herrmann.

Since kernels are mostly community-built, some won’t work as well as others. The Maxima kernel is one example that doesn’t work as expected, but wxMaxima provides a nice notebook interface. If you want to use wxMaxima from The Hacker, add a link to the wxMaxima start function (wxMaxima.exe on Windows). The Wolfram Language offers capabilities similar to those of wxMaxima and can be installed in Jupyter (see The Big Squish Theory - Part I).

wxMaxima Notebook Example

Some software tools have Jupyter kernels but aren’t listed in the two lists above. The Lean Theorem Prover can be run from Jupyter using Lean-To, and OpenModelica, a modeling and simulation environment, uses jupyter-openmodelica to connect to Jupyter.

To check that the kernels have been installed correctly, run

jupyter kernelspec list

If you see one listed as ‘ir’, that’s the R kernel. If you already have a kernel installed but want to remove it or update it to the latest version, first remove the existing kernel with jupyter kernelspec uninstall {KERNEL_NAME} and then reinstall it.
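If you have many kernels, it can be handy to turn that listing into something you can search. The sketch below parses text in the layout that jupyter kernelspec list prints, a header line followed by indented name and folder pairs. The sample text, including the kernel names and paths, is invented for illustration, and `parse_kernelspec_list` is a hypothetical helper rather than part of Jupyter.

```python
# Invented sample in the layout printed by `jupyter kernelspec list`
SAMPLE_OUTPUT = """\
Available kernels:
  ir         /home/user/.local/share/jupyter/kernels/ir
  julia-1.9  /home/user/.local/share/jupyter/kernels/julia-1.9
  python3    /home/user/miniforge3/envs/Jupyter/share/jupyter/kernels/python3
"""


def parse_kernelspec_list(text):
    """Map each kernel name to the folder that holds its kernelspec."""
    kernels = {}
    for line in text.splitlines():
        # Kernel entries are indented; the header line is not
        if not line.startswith(" "):
            continue
        parts = line.split(None, 1)  # name, then the rest of the line as the path
        if len(parts) == 2:
            kernels[parts[0]] = parts[1].strip()
    return kernels


kernels = parse_kernelspec_list(SAMPLE_OUTPUT)
print("R kernel installed:", "ir" in kernels)
for name, folder in sorted(kernels.items()):
    print(f"{name}: {folder}")
```

To use it on your own machine, replace `SAMPLE_OUTPUT` with the text copied from your terminal.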

Jupyter AI

For JupyterLab notebooks, Jupyter AI gives conversational assistance with your coding questions. The documentation (see ReadTheDocs on the Jupyter AI GitHub page) provides detailed instructions for installation and use. You should have the latest version of Python after installing Mamba, but you can check by starting the command prompt (in the MagicWhiteboard environment) and then typing python, which should generate something like:

Python 3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)] on win32

Enter exit() to quit Python. Now follow the directions for installing Jupyter AI using conda (or Mamba)

# change 4.0 to 3.0 if you need JupyterLab 3
conda config --add channels conda-forge
conda config --set channel_priority strict
conda install jupyterlab~=4.0
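The ~=4.0 in the last command is a "compatible release" specifier: it accepts version 4.0 and any later 4.x release, but not 5.0. The sketch below illustrates that rule as Python's packaging standard (PEP 440) defines it, on the assumption that conda interprets the operator the same way. The function `is_compatible_release` is written for illustration only and handles plain numeric versions, not pre-releases such as 4.0.0rc1.

```python
def is_compatible_release(version, spec="4.0"):
    """Check whether `version` satisfies `~=spec` for plain numeric versions."""
    candidate = tuple(int(part) for part in version.split("."))
    minimum = tuple(int(part) for part in spec.split("."))
    # Every component except the last one in the spec must match exactly
    prefix = minimum[:-1]
    return candidate >= minimum and candidate[: len(prefix)] == prefix


for version in ["3.6.7", "4.0", "4.1.2", "5.0.0"]:
    verdict = "accepted" if is_compatible_release(version) else "rejected"
    print(f"jupyterlab {version}: {verdict}")
```

Changing the specifier to ~=3.0, as the comment in the commands suggests, shifts the accepted range to the 3.x releases in the same way.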

Online science tools

Some math and science software tools don’t have Jupyter kernels but are available online. In Obsidian, you can build links to these just as you did for the online literature search tools described in The AI Librarian, and clicking on one of the links will open the app in a new window in Obsidian. Examples of online tools you might like to include are listed below. Arrange them on the Obsidian page using the Multi-Column Markdown plugin.

You could also link to any software that you’ve downloaded locally to your machine, but they would open in the usual way, and not in an Obsidian window.

Data Analysis

For data analysis, there are several very good options. One is Knime which includes the Knime AI Assistant to help build analytic models. Knime uses visual programming to construct workflows.

Knime Workflow

Components such as Excel Reader can be dropped onto the worksheet and connected to other drag-and-drop components to create a complete data analysis system. New components can be written in Python or downloaded from the Community Hub. Similar tools are the Orange Data Mining platform and Weka. If your data is in .csv format, you might want to consider the plugin JupyterLab Spreadsheet.

Another option is to use Google Colaboratory (Colab), “a hosted Jupyter Notebook service that requires no setup to use and provides free access to computing resources, including GPUs and TPUs.” Colab is mainly designed to use Python and now has AI coding assistance using Codey. Plug fast.ai into Colab and you’ll be “Making neural nets uncool again”. The founders of fast.ai, Jeremy Howard and Rachel Thomas, also provide a book, Deep Learning for Coders with Fastai and PyTorch: AI Applications Without a PhD, and two free online courses, Practical Deep Learning for Coders and From Deep Learning Foundations to Stable Diffusion, to go with the code.

Using AI for Science

Tools like The Magic Whiteboard are already being used to solve math and science problems. Sergei Gukov, a professor of mathematics and physics at Caltech, is one of the organizers of the Mathematics and Machine Learning 2023 conference. He says, “There are some mathematicians who may still be skeptical about using the tools. The tools are mischievous and not as pure as using paper and pencil, but they work.”

Gukov and his colleagues are working on the smooth Poincaré conjecture in four dimensions with the assistance of AI, which you can read about in SciTech Daily.

The Intersection of Math and AI: A New Era in Problem-Solving

Alhussein Fawzi and Bernardino Romera Paredes describe using FunSearch, an evolutionary method powered by LLMs, to help solve the cap set problem in FunSearch: Making new discoveries in mathematical sciences using Large Language Models.

New AI resources arrive daily to help you assemble your Magic Whiteboard, but you now have the outline for building the basic whiteboard in Obsidian, with the AI-assisted Librarian and Hacker ready to help with all of your STEM ideas.


Image credits

Hero: The Sequence, image created with DALL-E

Jupyter and Jupyternaut AI:

wxMaxima Notebook Example: Solving algebra problems step by step with wxMaxima

Knime Workflow: From Spreadsheets to Workflows

SciTech Daily: The Intersection of Math and AI: A New Era in Problem-Solving



Software

Obsidian

Obsidian is a personal knowledge base and note-taking software application that operates on Markdown.

Mamba

Mamba is an open-source package manager for science.

JupyterLab

JupyterLab is the latest web-based interactive development environment for notebooks, code, and data. Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning.

Julia

The Julia Project as a whole is about bringing usable, scalable technical computing to a greater audience: allowing scientists and researchers to use computation more rapidly and effectively; letting businesses do harder and more interesting analyses more easily and cheaply.

Lean Theorem Prover

Lean is a functional programming language that makes it easy to write correct and maintainable code. You can also use Lean as an interactive theorem prover.

wxMaxima

wxMaxima is a document based interface for the computer algebra system Maxima. Maxima is a system for the manipulation of symbolic and numerical expressions, including differentiation, integration, Taylor series, Laplace transforms, ordinary differential equations, systems of linear equations, polynomials, sets, lists, vectors, matrices and tensors. Maxima yields high precision numerical results by using exact fractions, arbitrary-precision integers and variable-precision floating-point numbers. Maxima can plot functions and data in two and three dimensions.

Octave

GNU Octave is a high-level language, primarily intended for numerical computations. It provides a convenient command line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with Matlab.

Pari/GP

PARI/GP is a widely used computer algebra system designed for fast computations in number theory (factorizations, algebraic number theory, elliptic curves, modular forms, L functions…), but also contains a large number of other useful functions to compute with mathematical entities such as matrices, polynomials, power series, algebraic numbers etc., and a lot of transcendental functions.

Wolfram Language

Wolfram Language is a symbolic language, deliberately designed with the breadth and unity needed to develop powerful programs quickly. By integrating high-level forms—like Image, GeoPolygon or Molecule—along with advanced superfunctions—such as ImageIdentify or ApplyReaction—Wolfram Language makes it possible to quickly express complex ideas in computational form.

Knime

KNIME Analytics Platform is open source software for creating data science applications and services.

Orange

Open source machine learning and data visualization.

Weka

Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rules mining, and visualization.

Colab

Colab is a hosted Jupyter Notebook service that requires no setup to use and provides free access to computing resources, including GPUs and TPUs. Colab is especially well suited to machine learning, data science, and education.

fast.ai

fastai simplifies training fast and accurate neural nets using modern best practices. fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches.

fast.ai—Making neural nets uncool again. You can use fastai without any installation by using Google Colab.
