Summary

The posts in this section are commentary on my participation in the Regina project. They are intended to give people some perspective on what is going on in the project and what has been done, as well as on some of the problems we've encountered along the way that have led to the current design. Regina has a bigger agenda than my personal perspective, so do keep that in mind while reading: this is very much biased towards the subset of Regina project goals that agree with my own!

I started working on Regina when I was a postdoc at the Max Planck Institute, back in 2005. Before that I had held postdoctoral positions with relatively high teaching loads. MPI gave me a great opportunity not only to finish a bunch of papers, but to read papers and catch up on my interests. One of the things I caught up on was Ben Burton's dissertation. I downloaded Regina and was very curious about the 3-sphere recognition algorithm. The simplicity of the algorithm in Regina really impressed me, and Ben's structuring of the code, his keeping it in a Subversion repository, and the auto-documentation feature all impressed me very much as well.

I started to imagine that I could explore a problem Gregor Masbaum had made me aware of while I was in Oregon: the question of which 3-manifolds embed smoothly in S4. So I set about understanding some obstructions to embedding, and noticed that the Hantzsche obstruction (using the torsion linking form) was something that could be implemented algorithmically, thanks to the work of Kawauchi and Kojima on classifying torsion linking forms, together with more classical work on the classification of torsion forms at odd primes.

It took me quite a while to finish implementing that code. I think I finished it the next year, while at IHES. My coding skills were rusty, and Ben was coding in a more sophisticated manner than I was used to; until then my main coding experience had been writing video games back in the 1980s. That work was the birth of the NHomologicalData and NMarkedAbelianGroup classes in Regina. To date, the Hantzsche obstruction appears to be not only one of the easiest embedding obstructions to compute, but also one of the most powerful in the literature.

In 2008 we decided to implement a 4-manifold triangulation class, which Ben did almost entirely on his own while visiting Victoria; this was before Ben acquired his position in Brisbane. We needed robust methods to deal with Poincaré duality for 4-manifolds, so the NHomologicalData class was discarded in favour of the more modular and lightweight NCellularData class, which could compute Poincaré-duality intersection forms for 3- and 4-manifolds. It was becoming clear that, as we got interested in more varieties of manifold invariants, we would have to make our code more modular in order to conserve memory and avoid duplication of effort. The first draft of NCellularData was finished during my visit to Brisbane in 2010.

A quick sketch of my primary contributions:

  • Building classes for chain complexes and chain maps. This allows us to have "coordinates" in the abelian-group invariants of manifolds, which in turn allows the computation of natural homomorphisms such as Poincaré duality. This is the NMarkedAbelianGroup class.
  • NCellularData can also compute Poincaré duality in covers of manifolds, so in principle it should (eventually) be able to compute things like twisted Alexander polynomials. At present it can compute plain single-variable Alexander polynomials. I intend to do multi-variable Alexander polynomials soon(ish), but the algorithms I've been writing to compute Groebner bases have been rather unappealing.
  • Regina uses exact arithmetic, meaning arbitrary-precision integers whenever integers are called for. This is pretty much unavoidable if one wants to compute things like Alexander polynomials from chain complexes without errors creeping into the computations, but it can also be a little cumbersome and memory-intensive. One place where exact arithmetic was clearly important was in computing Poincaré duality. For this we take the classical approach: the manifold has its triangulation, the dual polyhedral decomposition, and the common refinement (the barycentric subdivision of the original triangulation). When computing the homology of the barycentric subdivision of even a very small triangulation, one is typically computing Smith normal forms of matrices in the 200x200 size range. The basic Smith normal form algorithm one learns in the classification of finitely generated abelian groups simply does not suffice for this kind of work: the coefficients of even relatively sparse matrices can grow larger than a 64-bit integer. So we had to migrate to more intelligent Smith normal form algorithms. The present one attempts to choose, locally, the row-reduction operations that minimize the size of the matrix entries in the next step (a toy version of this pivoting idea appears in the first sketch following this list). Similarly, large integers appear in the algorithms that compute Groebner bases.
  • In the process of building the 4-manifold census we found that the strongest first line of attack on the problem was via the fundamental group. So I've been revamping the NGroupPresentation code significantly, building a full implementation of small cancellation theory, as well as a heuristic algorithm (i.e. one that does not always succeed) to check whether a group is an extension over the integers; this is basically just the Reidemeister-Schreier algorithm (the second sketch following this list shows the first step, detecting a surjection onto the integers). These kinds of algorithms should probably be extended and elaborated on, but we haven't yet settled on a framework for more general algorithms, due to the open-ended and non-algorithmic nature of group presentations. The software Magnus does a great job of dealing with finitely presented groups, but it has not been regularly maintained in some time and is difficult to compile on contemporary machines. Ideally someone would modernise Magnus, turning it into a well-maintained, independent chunk of software that Regina could simply compile against. Until something like that happens, we will code what we need when we need it.
  • I've hand-coded some classes to deal with Z2-central extensions of the symmetric groups S4 and S5. This is to eventually implement my Combinatorial Spin Structures on Triangulated Manifolds paper in Regina, for generating and manipulating spin and spinc 4-manifolds, with an eye towards things like Rochlin-invariant algorithms for 3-manifolds, or implementing other invariants from 4-manifold theory that use such structures.
  • I'm in the process of implementing a heuristic algorithm to determine whether a triangulated manifold fibers over the circle. Many of the core algorithms are complete, and the software is presently capable of showing that the Cappell-Shaneson knot Ben, Jon and I discovered is PL-equivalent to the standard Cappell-Shaneson knot; in the paper we had only proved that there was a homeomorphism between the manifolds. I'm presently attempting to automate the process of finding bundle structures as much as possible, to minimize memory consumption and (hopefully) make the algorithm efficient enough that it can significantly impact our project to tabulate small triangulated PL 4-manifolds up to PL-homeomorphism.
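Two of the items above invite a small illustration. The sketch below is in Python rather than Regina's C++, leans on Python's native arbitrary-precision integers, and is purely illustrative (none of it is Regina's actual code): it diagonalises an integer matrix while greedily pivoting on the smallest nonzero entry, then reads a homology group off the boundary matrices of a small chain complex, roughly the sort of bookkeeping the classes above automate.

    # Toy sketch, not Regina code.  Diagonalise an integer matrix by row and
    # column operations, greedily pivoting on the smallest nonzero entry so the
    # entries stay as small as possible.  Python integers never overflow.
    def diagonal_entries(mat):
        A = [list(row) for row in mat]
        divs = []
        while A and A[0] and any(x for row in A for x in row):
            # Greedy pivot choice: the nonzero entry of smallest absolute value.
            pi, pj = min(((i, j) for i, row in enumerate(A)
                                 for j, x in enumerate(row) if x),
                         key=lambda ij: abs(A[ij[0]][ij[1]]))
            A[0], A[pi] = A[pi], A[0]                  # move the pivot to (0, 0)
            for row in A:
                row[0], row[pj] = row[pj], row[0]
            p = A[0][0]
            for i in range(1, len(A)):                 # clear the first column...
                q = A[i][0] // p
                if q:
                    A[i] = [a - q * b for a, b in zip(A[i], A[0])]
            for j in range(1, len(A[0])):              # ...and the first row
                q = A[0][j] // p
                if q:
                    for row in A:
                        row[j] -= q * row[0]
            if all(A[i][0] == 0 for i in range(1, len(A))) and \
               all(A[0][j] == 0 for j in range(1, len(A[0]))):
                divs.append(abs(p))                    # pivot finished; recurse on the rest
                A = [row[1:] for row in A[1:]]
            # otherwise a smaller remainder was left behind and we re-pivot
        return divs     # torsion of the cokernel is the direct sum of Z/d over these

    # H_k of a chain complex of free abelian groups, read off the boundary
    # matrices d_k : C_k -> C_{k-1} and d_{k+1} : C_{k+1} -> C_k.
    def homology(rank_Ck, d_k, d_k1):
        ed_k, ed_k1 = diagonal_entries(d_k), diagonal_entries(d_k1)
        free_rank = rank_Ck - len(ed_k) - len(ed_k1)
        torsion = [d for d in ed_k1 if d > 1]
        return free_rank, torsion

    # H_1 of the projective plane from its one-vertex CW structure:
    # d_1 is the zero map and d_2 is multiplication by 2.
    print(homology(1, [[0]], [[2]]))    # (0, [2]), i.e. Z/2

In Regina the analogous computation also remembers generators, so that maps such as Poincaré duality can actually be written down in these coordinates.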
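In the same spirit, here is a sketch of the first step of the extension-over-the-integers test mentioned above (again Python, again not the Regina implementation, and the helper names are my own): a finitely presented group surjects onto the integers exactly when its abelianisation has positive free rank, which reduces to a rank computation on the abelianised relator matrix. The Reidemeister-Schreier rewriting of a presentation for the kernel is the harder step and is not shown.

    from fractions import Fraction

    def rational_rank(rows):
        # Rank of an integer matrix, by Gaussian elimination over the rationals.
        A = [[Fraction(x) for x in row] for row in rows]
        rank, col, ncols = 0, 0, (len(rows[0]) if rows else 0)
        while rank < len(A) and col < ncols:
            piv = next((i for i in range(rank, len(A)) if A[i][col]), None)
            if piv is None:
                col += 1
                continue
            A[rank], A[piv] = A[piv], A[rank]
            for i in range(rank + 1, len(A)):
                f = A[i][col] / A[rank][col]
                A[i] = [a - f * b for a, b in zip(A[i], A[rank])]
            rank, col = rank + 1, col + 1
        return rank

    def maps_onto_Z(num_gens, relators):
        # Each relator is a list of (generator index, exponent) pairs.  The group
        # surjects onto Z iff the abelianised relator matrix has rank < num_gens.
        rows = []
        for rel in relators:
            row = [0] * num_gens
            for gen, exp in rel:
                row[gen] += exp
            rows.append(row)
        return rational_rank(rows) < num_gens

    # The trefoil knot group < a, b | a^2 = b^3 >: the knot complement fibres
    # over the circle, so the group had better surject onto Z.
    print(maps_onto_Z(2, [[(0, 2), (1, -3)]]))    # True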

I will expand on many of the above topics in later posts. 

Regina uses arbitrary precision integers.

In contrast, almost every computer sold, from wristwatches to iPads to massive computing clusters, represents integers using a standard data type in which each integer takes a fixed amount of memory: nowadays, on most computers, that's 32 or 64 bits. Most programming languages also provide certain "long" integer data types with two to four times as much memory allocated per integer. But what this means is that your integers have a ceiling, and in any computation, if you ever exceed that threshold, your resulting answer will likely be wrong.
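Here is a tiny illustration of that ceiling. Python's own integers are arbitrary precision, so the sketch simulates a signed 64-bit machine word by hand: the square of a number that fits comfortably in 64 bits silently comes out negative.

    # Python integers never overflow, so we mimic what a signed 64-bit machine
    # word would silently do with the same computation.
    def as_int64(n):
        n &= (1 << 64) - 1                        # keep only the low 64 bits
        return n - (1 << 64) if n >= (1 << 63) else n

    x = 3 ** 20                                   # 3486784401, fits easily in 64 bits
    print(x * x)                                  # 12157665459056928801 (exact)
    print(as_int64(as_int64(x) * as_int64(x)))    # -6289078614652622815 (wrapped around)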

You may think, "when am I going to need integers larger than 2^256? That's larger than 10^77!" But that misses one of the central facts about mathematics: some answers are difficult (or even impossible) to acquire. For example, some algorithms are known to be slow in principle, like the algorithm to find a Groebner basis for an ideal: this has double-exponential run time in the worst case, and any algorithm that produces Groebner bases must be double-exponential in the worst case.

I ran into this phenomenon in a way that I found surprising at the time. I had finished the algorithm that produces Groebner bases for single-variable Laurent polynomial rings in Regina, and I was testing it. The algorithm was working swimmingly most of the time: it was responsive, it was fast, and it wasn't using much memory. It was cranking out Alexander polynomials like they were nothing. But there was one case where it was slow and used a lot of memory. I looked into that case. It was producing a very small Alexander polynomial at the end of the algorithm, but in the intermediate steps it was producing polynomials with enormous coefficients, larger than 10^77.
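The same phenomenon shows up in miniature in the ordinary Euclidean algorithm for polynomials over the rationals: small inputs, a small final answer, and much larger numerators and denominators along the way. The sketch below is only an illustration of this coefficient growth (it is not the Groebner basis routine in Regina); it runs Knuth's classic example of two small polynomials whose gcd is 1.

    from fractions import Fraction

    def poly_rem(a, b):
        # Remainder of a modulo b over the rationals; coefficients low degree first.
        r = [Fraction(c) for c in a]
        b = [Fraction(c) for c in b]
        while len(r) >= len(b):
            f, shift = r[-1] / b[-1], len(r) - len(b)
            for i, c in enumerate(b):
                r[i + shift] -= f * c
            r.pop()                               # the leading term cancels exactly
            while r and r[-1] == 0:
                r.pop()
        return r

    def remainder_sequence(a, b):
        seq = [a, b]
        while True:
            r = poly_rem(seq[-2], seq[-1])
            if not r:
                return seq
            seq.append(r)

    # Knuth's example: gcd(x^8 + x^6 - 3x^4 - 3x^3 + 8x^2 + 2x - 5,
    #                      3x^6 + 5x^4 - 4x^2 - 9x + 21) = 1.
    a = [-5, 2, 8, -3, -3, 0, 1, 0, 1]
    b = [21, -9, -4, 0, 5, 0, 3]
    for r in remainder_sequence(a, b):
        size = max(max(abs(c.numerator), abs(c.denominator)) for c in r)
        print('degree', len(r) - 1, ' largest numerator or denominator:', size)

The intermediate remainders carry far larger numerators and denominators than anything in the input or in the final constant; the blow-up I hit in Regina was the same effect on a much larger scale.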

Of course, if one is only interested in the Alexander polynomial, there are relatively fast and efficient ways of computing it. In Regina the Alexander polynomial is simply a "guide" to the deeper invariants of knots and manifolds, such as the Milnor signatures, which are also computable in Regina.

Needless to say, I suspect we can do quite a bit to optimise the Groebner basis algorithm in Regina.