December 14, 2014

Vince Knight

A busy term

Here’s my final reflective post for Imogen Dunne’s final year project (you can find the first here, the second here and the third here).

The main feeling I have as I reflect on this past term is how tired I am.

This term/semester/thing has been so amazingly busy. Before I go any further I need to thank my PhD student Jason Young. Jason has just started his PhD and has been acting as a full teaching assistant for me with this course. I have no idea how this course ran without having someone to help me last year and this leads me to my first point of reflection: students have engaged very well with the course.

I believe (hope is perhaps a less biased word) that the main reason this has been busy is because this has been an active learning experience for my students. This could be due to the flipped learning environment (I would like to think so) but also perhaps just having a really greatly engaged set of students.

Here is a list of things that I plan on changing for next year:

  • Another set of videos - I’m going to double the amount of videos I have. I think this is a natural thing to do as I’m more aware of the difficulties students have. This is all part of the reflective pedagogy that revolves around the concept of being able to react in a timely manner to feedback as to student difficulties (if I’m not doing this than I am no better than a book). The reflective approach ensures a dynamic reaction to difficulties which can happen on various scales:

    • Class meetings: based on difficulties during the week (micro level)
    • From year to year: based on major trends during the year (macro level)
  • More scaffolding on technical issues - I always underestimate the starting point of some students. I need to do a better job helping them with simple things like using a mouse, using internet browsers and also more tricky things like debugging LaTeX.

  • As discussed before: a much better scaffolding of student tutors (the undergrads).

On the whole though I am very pleased with how this year has gone. In particular I feel that a number of students have not only learnt to code but also understood the importance of learning to code in conjunction with learning mathematics. Here is a quote that I’m taking from one of the 3 page papers that students have had to hand in at the end of term (this particular one looking at the futurama theorem):

“It has to be said, when I first looked at Keeler’s proof, the whole thing did seem incredibly complicated. But after coding it and using it in ‘real life’, it is pretty straight forward.”

This is really nice to read as it’s something I say a lot: coding complicated mathematics will often help understand it.

It reminds me a bit about this nice lightning talk that one of the students gave last year:

December 14, 2014 12:00 AM

December 11, 2014

David Horgan (quantumtetrahedron)


In order to follow up some work on the the last two posts I have been looking at  the capabilities of Sagemath with regard to calculating hypergeometric functions:

Fortunately sagemath can implement at number of great codes for hypergeometric functions including:

These give excellent results:





The ability to use mpmath code is very useful since it enables me to calculate a wide range of hypergeometric functions as can be seen on the reference pages. I’ll be using this over the coming posts starting with :

Exact Computation and Asymptotic Approximations of 6j Symbols:
Illustration of Their Semiclassical Limits which is in preparation.

Related articles


by David Horgan at December 11, 2014 08:49 PM

December 10, 2014

Vince Knight

A Sneak Preview of Game Theory in Sage (3/3): Normal Form Games

In two previous posts I have discussed two game theoretical concepts that are in/on their way in to Sage:

Since writing those and alluding to more development that myself and an undergraduate here at Cardiff are working on, I’ve had a fair few people asking about when Normal Form Games will be included…

The purpose of this post is to say: extremely soon! A NormalFormGame class has now also been merged in to the develop branch of Sage (so it will be in the next release).

What is a normal form game?

These are sometimes referred to as bi-matrix games or strategic form games. I wrote a blog post about these in reference to choosing a side of the pavement to walk on: here.

In this post I’ll take a look at what the new Sage class allows you to do.

Consider the game I used to model two individuals walking on either the left or the right of the pavement:

\[ A = \begin{pmatrix} 1&-1\\
-1&1 \end{pmatrix} \] \[ B = \begin{pmatrix} 1&-1\\
-1&1 \end{pmatrix} \]

Matrix \(A\) gives the utility to the first person (assuming they control the rows) and the matrix \(B\) gives the utility to the second person (assuming they control the columns). So if both individuals walk on their left then they both get a utility of 1 (ie they don’t bump in to each other).

Defining a game

We can define these two matrices in Sage and will be able to define a NormalFormGame as follows:

sage: A = matrix([[1, -1],[-1, 1]])
sage: B = matrix([[1, -1],[-1, 1]])
sage: g = NormalFormGame([A, B])

As you can see the NormalFormGame class uses a list of two matrices to construct a game. If you look at the documentation you’ll see that there are other ways to construct games. To see that this has indeed worked we can just see the output of the game:

sage: g
Normal Form Game with the following utilities: {(0, 1): [-1, -1], (1, 0): [-1, -1], (0, 0): [1, 1], (1, 1): [1, 1]}

This displays a dictionary of the strategy:utility pairs. The form of the output is actually based on gambit syntax.

We can use this class to very easily obtain equilibria of games:

Finding Nash equilibria

sage: g.obtain_nash()
[[(0, 1), (0, 1)], [(1/2, 1/2), (1/2, 1/2)], [(1, 0), (1, 0)]]

The output shows that there are three Nash equilibria for this game:

  • Both players walking on their right;
  • Both players walking on their left;
  • Both players alternating from left to right with 50% probability.

There are currently 2 algorithms implemented in Sage to calculate equilibria:

  1. A support enumeration algorithm (this is not terribly efficient but is written in pure Sage so will work even if optional packages are not installed and for typical game sizes will work just fine).
  2. A reverse search algorthim which calls the optional ‘lrs’ library.

My student and I are currently actively developing further integration with gambit which will allow for a linear complementarity algorithm and also solution algorithms for games with more than 2 players.

Here is one other example:

sage: A = matrix([[0, -1, 1, 1, -1],
....:             [1, 0, -1, -1, 1],
....:             [-1, 1, 0, 1 , -1],
....:             [-1, 1, -1, 0, 1],
....:             [1, -1, 1, -1, 0]])
sage: g = NormalFormGame([A])
sage: g.obtain_nash(algorithm='lrs')
[[(1/5, 1/5, 1/5, 1/5, 1/5), (1/5, 1/5, 1/5, 1/5, 1/5)]]

As you can see above, I’ve created a game by just passing a single matrix: this automatically creates a zero sum game. I’ve also told Sage to make sure it uses the 'lrs' algorithm (although 'enumeration' would handle this 5 by 5 game just fine).

Finally if you’re not actually sure what that game is take a look at this little video:

I’m very excited to see this in Sage (soon!) and am actively working on various other things that I know at least I will find useful in my research and teaching but hopefully others will also.

December 10, 2014 12:00 AM

November 22, 2014

Vince Knight

Thinking about divisibility by 11

This post is based on a class meeting I had recently with my programming class. It was based on trying to use code to help identify a condition that a number must obey for it to be divisible by 11. Readers of this blog might be aware that the following is incorrect but stick with me.

Exploring a statement

A number is divisible by 11 if and only if the alternating (in sign) sum of the number’s digits is 0.

To help with notation let us define \(f:x\to\text{alternating sum of digits of x}\) so for example we have:


It is immediate to note that for \(N< 100\) \(f(N)=0\) if and only if 11 divides \(N\) (\(11\;|\;N\) for short). Before trying to prove our statement we could check it for a few more numbers:

  • \(f(110)=0\)
  • \(f(121)=0\)
  • \(f(132)=0\)
  • \(f(143)=0\)
  • \(f(154)=0\)
  • \(f(165)=0\)
  • \(f(176)=0\)
  • \(f(187)=0\)
  • \(f(198)=0\)

So things are looking good! We could now rush off and try to prove that our statement is correct… or we could try more numbers. The easiest way to ‘try enough’ is to write some simple code (the following is written in Python):

class Experiment():
    A class for an experiment
    def __init__(self, N):
    Initialisation method:
    inputs: N - the number for which we will check the conjecture
        self.N = N
        self.divisible_by_11 = N % 11 == 0
        self.sum_of_consecutive_digits = sum([(-1) ** d *int(str(N)[d]) for d in range(len(str(N)))])
    def test_statement(self):
        Returns True if 'A number is divisible by 11 iff the alternating sum digits is 0' holds for this particular number.
        if self.divisible_by_11:
            return self.sum_of_consecutive_digits == 0
        return self.sum_of_consecutive_digits != 0

This creates a class for an Experiment for a given number, which has a couple of attributes relevant to what we’re trying to do:

>>> N = Experiment(121)
>>> N.divisible_by_11
>>> N.sum_of_consecutive_digits

There is also a method that checks the if and only if condition of our statement:

        if self.divisible_by_11:
            return self.sum_of_consecutive_digits == 0
        return self.sum_of_consecutive_digits != 0

So if the number is divisible by 11 then the statement is true if the sum is 0. If the number is however not divisible by 11 then the statement is true if the sum is not 0.

We can thus check for any given number if our statement is true:

>>> N = Experiment(121)
>>> N.test_satement()  # 121 is divisible by 11 and 1-2+1==0
>>> N = Experiment(122)
>>> N.test_statement()  # 122 is not divisible by 11 and 1-2+2!=0

So before attempting to prove anything algebraically let’s just check that it holds for the first 10000 numbers:

>>> all(Experiment(N).test_statement() for N in range(10001))

Disaster! It looks like our statement is not quite right!

The following might help us identify where (outputting a list of numbers for which the statement is false):

>>> [N for N in range(10001) if not Experiment(N).test_statement()]
[209, 308, 319, 407, 418, 429, 506, 517, 528, 539, 605, 616, 627, 638, 649, 704, 715, 726, 737, 748, 759, 803, 814, 825, 836, 847, 858, 869, 902, 913, 924, 935, 946, 957, 968, 979, 1309, 1408, 1419, 1507, 1518, 1529, 1606, 1617, 1628, 1639, 1705, 1716, 1727, 1738, 1749, 1804, 1815, 1826, 1837, 1848, 1859, 1903, 1914, 1925, 1936, 1947, 1958, 1969, 2090, 2409, 2508, 2519, 2607, 2618, 2629, 2706, 2717, 2728, 2739, 2805, 2816, 2827, 2838, 2849, 2904, 2915, 2926, 2937, 2948, 2959, 3080, 3091, 3190, 3509, 3608, 3619, 3707, 3718, 3729, 3806, 3817, 3828, 3839, 3905, 3916, 3927, 3938, 3949, 4070, 4081, 4092, 4180, 4191, 4290, 4609, 4708, 4719, 4807, 4818, 4829, 4906, 4917, 4928, 4939, 5060, 5071, 5082, 5093, 5170, 5181, 5192, 5280, 5291, 5390, 5709, 5808, 5819, 5907, 5918, 5929, 6050, 6061, 6072, 6083, 6094, 6160, 6171, 6182, 6193, 6270, 6281, 6292, 6380, 6391, 6490, 6809, 6908, 6919, 7040, 7051, 7062, 7073, 7084, 7095, 7150, 7161, 7172, 7183, 7194, 7260, 7271, 7282, 7293, 7370, 7381, 7392, 7480, 7491, 7590, 7909, 8030, 8041, 8052, 8063, 8074, 8085, 8096, 8140, 8151, 8162, 8173, 8184, 8195, 8250, 8261, 8272, 8283, 8294, 8360, 8371, 8382, 8393, 8470, 8481, 8492, 8580, 8591, 8690, 9020, 9031, 9042, 9053, 9064, 9075, 9086, 9097, 9130, 9141, 9152, 9163, 9174, 9185, 9196, 9240, 9251, 9262, 9273, 9284, 9295, 9350, 9361, 9372, 9383, 9394, 9460, 9471, 9482, 9493, 9570, 9581, 9592, 9680, 9691, 9790]

The first of those numbers is \(209=11\times19\) so it is divisible by 11 but \(f(209)=2-0+9=11\) and if we calculate \(f\) for a few more of the numbers in the above list we again get \(11\). At this point in time it seems like we need to adjust our statement.

Sufficient evidence for a conjecture

A number is divisible by 11 if and only if the alternating (in sign) sum of the number’s digits is divisible by 11.

A slight tweak of the Experiment code above gives:

class Experiment():
    A class for an experiment
    def __init__(self, N):
    Initialisation method:
    inputs: N - the number for which we will check the conjecture
        self.N = N
        self.divisible_by_11 = N % 11 == 0
        self.sum_of_consecutive_digits = sum([(-1) ** d *int(str(N)[d]) for d in range(len(str(N)))])
    def test_conjecture(self):
        Returns True if 'A number is divisible by 11 iff the alternating sum digits is 0' holds for this particular number.
        if self.divisible_by_11:
            return self.sum_of_consecutive_digits % 11 == 0
        return self.sum_of_consecutive_digits % 11 != 0

Now let us check the first 100,000 numbers:

>>> all(Experiment(N).test_conjecture() for N in range(100001))

When we have a lot of evidence for a mathematical statement we can (generally) start calling it a conjecture. At this point we probably can attempt to prove that the conjecture is true:


Let \(n_i\) be the \(i\)th digit of the \(m\) digit number \(N\), so we have \(N=\sum_{i=1}^{m}n_i10^{i-1}\). Using arithmetic modulo \(11\) we have:



The right hand side of that is of course just \(f(N)\) so \(11\;|\;N\) if and only iff \(11\;|\;f(N)\) (as required).

This is how a lot of mathematics gets done nowadays. Statements get made, then refined then checked and then finally (hopefully) proved. A nice book that describes a conjecture that stayed a conjecture for a long time (until ultimately being proved) is Proofs and Confirmations: The Story of the Alternating-Sign Matrix Conjecture by Bressoud.

November 22, 2014 12:00 AM

November 16, 2014

Martin Albrecht


Sage 6.4 is out. Here are some (to me) particularly exciting changes.

#16479: Vincent Delecroix: package for pip the Python installer

Sage now ships with pip which means you can use pip to install your favourite Python packages into the Sage environment, making it a lot easier to get access to the Python ecosystem. For example, say you want to use BatzenCA in Sage. You’d call

$ sage -pip install batzenca

and you are done.

Update: Sorry, I mispoke: pip is an optional package at the moment, not a standard package. You’ll have to sage -i pip it first.

#15915: Martin Albrecht: add discrete Gaussian samplers to Sage

I contributed code to sample from discrete Gaussian distributions to Sage. Discrete Gaussian distributions over the Integers sample integers proportionally to exp(-(x-c)²/(2σ²)), where c is the center and σ the Gaussian width parameter (roughly, the standard deviation). Here’s an example:

sage: from sage.stats.distributions.discrete_gaussian_integer import DiscreteGaussianDistributionIntegerSampler as DiscGauss
sage: D = DiscGauss(3.0)
sage: D() # output random by definition

Discrete Gaussian distributions are commonly used in lattice-based cryptography. For example, our GGHLite implementation makes use of the same code. That is, most of the code is written in C99 and also available as the stand-alone dgs library under a 2-clause BSD library.

I also implemented discrete Gaussians over arbitrary lattices. A discrete Gaussians over a lattice Λ is a distribution where point x occurs with probability:

\mbox{exp}(-|x-c|_2^2/(2\sigma^2))/(\sum_{x \in \Lambda} \mbox{exp}(-|x|_2^2/(2\sigma^2)))

However, this code not part of the dgs library but a pure Python module. A C99 implementation of the same algorithms as well as specialised algorithms for ideal lattices is part of the GGHLite implementation as the dgsl module, though.

#16803: Marc Masdeu: Reimplement matrix_integer_dense using FLINT

Marc switched over matrices over \mathbb{Z} to using FLINT as the native data type. Previously, we had our own data type (arrays of arrays of mpz_t), custom code and lots of conversions to other data types (LinBox, IML, …) to realise some functionality such as characteristic polynomials. FLINT is a sane default data type as it is quite fast for many operations. Conversions to LinBox et al. are still in place so functionality should not be lost. However, some decisions about defaults might be outdated now so report any issues on trac.

#16996: Volker Braun: IPython notebook with Sage Extensions

Thanks to Volker’s efforts you can now use the iPython notebook for Sage. To me the iPython notebook feels a lot more modern than the Sage notebook, partly because it is actively being developed whereas not much work is currently done on the native Sage notebook (it has seen some work done on it in the current release, though). I also like that iPython notebooks are files in your current directly which means you can keep them project local (others disagree about this design decision). To start the iPython notebook run:

$ sage -notebook=ipython

by martinralbrecht at November 16, 2014 08:46 PM

November 14, 2014

William Stein

SageMathCloud Notifications are Now Better

I just made live a new notifications systems for  SageMathCloud, which I spent all week writing.  

These notifications are what you see when you click the bell in the upper right.   This new system replaces the one I made live two weeks ago.     Whenever somebody actively *edits* (using the web interface) any file in any project you collaborate on, a notification will get created or updated.    If a person *comments* on any file in any project you collaborate on (using the chat interface to the right), then not only does the notification get updated, there is also a little red counter on top of the bell and also in the title of the  SageMathCloud tab.   In particular, people will now be much more likely to see the chats you make on files.

NOTE: I have not yet enabled any sort of daily email notification summary, but that is planned. 

Some technical details:  Why did this take all week?  It's because the technology that makes it work behind the scenes is something that was fairly difficult for me to figure out how to implement.  I implemented a way to create an object that can be used simultaneously by many clients and supports realtime synchronization.... but is stored by the distributed Cassandra database instead of a file in a project.   Any changes to that object get synchronized around very quickly.   It's similar to how synchronized text editing (with several people at once) works, but I rethought differential synchronization carefully, and also figured out how to synchronize using an eventually consistent database.    This will be useful for implementing a lot other things in SageMathCloud that operate at a different level than "one single project".  For example, I plan to add functions so you can access these same "synchronized databases" from Python processes -- then you'll be able to have sage worksheets (say) running on several different projects, but all saving their data to some common synchronized place (backed by the database).   Another application will be a listing of the last 100 (say) files you've opened, with easy ways to store extra info about them.    It will also be easy to make account and project settings more realtime, so when you change something, it automatically takes effect and is also synchronized across other browser tabs you may have open.   If you're into modern Single Page App web development, this might remind you of Angular or React or Hoodie or Firebase -- what I did this week is probably kind of like some of the sync functionality of those frameworks, but I use Cassandra (instead of MongoDB, say) and differential synchronization.  

I BSD-licensed the differential synchronization code  that I wrote as part of the above. 

by William Stein ( at November 14, 2014 02:31 PM

Vince Knight

Scaffolding tutors and how to better prepare for different pedagogies

Here’s my third reflective post for Imogen Dunne’s final year project (you can find the first here and the second here.

The course is now on the final straight as students have finished the class test which is always a bit of a milestone as they now begin to really individualise their learning experience (working on individually chosen projects etc…). As we get to this point I’ve got 3 specific things I’ve been thinking about:

  1. Class test performance

    With over half of the scripts marked it seems like students have done quite well on what is a difficult piece of assessment. It seems that the performance is overall better than last year but it’s too early to try and guess as to why that is.

  2. Scaffolding undergraduate tutors

    I use second year students to tutor in the lab sessions. They are all students who did the course last year. I don’t think I’ve done the best job explaining the roles of the tutors and this is something I need to get right next year. I still am very excited about using undergraduate tutors as I think it’s a brilliant experience for them as it continues their learning. The other (super important reason) is that I think this peer level of instruction is undoubtedly of benefit to the students on the course (if this wasn’t a quick rough drop of thoughts I’d be able to find a large quantity of resources relevant to this).

    The flipped approach used requires the tutoring to be very light touch, and more about giving feedback than giving ‘help’. I didn’t do the best job of explaining this as some undergraduate tutors felt it was their job to ‘get students code to work’. I’ll probably run a bit more of an ‘expansive’ training session with closer shadowing at the start.

    All the tutors have done a brilliant job and I’m very pleased, I just have to think carefully about how to best ‘scaffold’ and support them.

  3. Communicating the different pedagogy

    Some students were questioning the timing of my office hours: they are after the class meeting (which is in turn after the lab sessions). Whilst most students are on board and understand I think that the fact that a couple of students didn’t understand this shows that I didn’t do the best job of communicating the ideas.

    Involving the students in a discussion about the pedagogy they are experiencing is something I think is very important (especially when using something they might not have experienced before). As such I have tried and continue to try to talk about this throughout my interactions with students (from 1 to 1 meetings all the way to class meetings) but perhaps I could do more. Perhaps even just showing a picture like this would be helpful:

    That shows the reason why I have my class test at the end of a week, the idea being that most students will be happy with content through labs, some will be happy after the class meeting and a small amount will require specific time with me to help them through. This goes back to the premise of a flipped environment which aims to make best use of contact time: my class meetings are meant for us to further understanding and go towards the tip of Bloom’s taxonomy but also address specific difficulties.

On a whole I’m very happy with how this class is going, I feel that most students are engaging fully with the course and seem to be enjoying it. I also feel that I’ve gotten certain things a bit better than last year this year which could be an explanation of the slightly better mark in the class test (I still have 70 scripts to mark this weekend so it’s probably too early to tell).

One particularly nice piece of feedback is how students like the new class website (jekyll for the win!) and in particular have enjoyed the use of a comment system on all pages: it’s always nice to have that permanent record of discussions I’m having with the students which in turn could help other students. Ideally the discussion would be peer to peer but that hasn’t happened much this year (mainly me answering queries) but it has happened once which I’m happy about (small steps, big wins) :)

November 14, 2014 12:00 AM

October 25, 2014

Vince Knight

Busy office hours

Here’s my second reflective post for Imogen Dunne’s final year project (you can find the first here).

We are now at the end of Week 4 of the course and I’m glad to say that I think overall things are going well.

  • My office hours have gotten very busy. This is awesome. Students are coming to see me, genuinely having struggled with concepts and this is often a result of myself or another tutor identifying specific issues in a lab session and saying something like:

    I’ll give you the tick for that but come and see me during office hours to talk about it as I think you’re a bit confused there.

  • I think students are understanding now that the role of the ‘ticks’ is to help identify difficulties. I need to do a better job next year at explaining this from the offset. It might help to put it at the top of every lab sheet… I’ll think about this…

  • Imogen’s focus group went really well with 15 students participating. I still need to carefully reflect on the particular issues that were raised. I feel again that some need to be addressed through a better communication on my part (for example students wanted to have a list of alternative resources, there is such a list on the class site and I thought I’d mentioned it sufficiently but I will need to do that better).

  • There are a large number of tutors now and with that comes the tricky task of ensuring they understand their role. Jason was mentioned in Imogen’s focus group as doing a great job as “he didn’t just give the answer but pushed students to identify their difficulties”. I need to think very carefully about how to get this across to all the tutors (I think this has been addressed for the current year).

  • I’ve been thinking about the resources and am thinking that I might create a second set of videos. This would be a large quantity of work (3/4 weekends probably) but I think I could really help student further. In a way it seems logical to do after having run the course twice. This would further ‘flip the class’…

  • On another good note not entirely irrelevant to the class: code club is going well! The last session was a very busy one and students are actively participating which is great. Some first years are starting to work on the Euler problems and other formed a bit of a revision group for the upcoming class test they have… You can see the site (that has been really nicely stylised by the students):

October 25, 2014 12:00 AM

October 17, 2014

William Stein

A Non-technical Overview of the SageMathCloud Project

What problems is the SageMathCloud project trying to solve? What pain points does it address? Who are the competitors and what is the state of the technology right now?

What problems you’re trying to solve and why are these a problem?

  • Computational Education: How can I teach a course that involves mathematical computation and programming?
  • Computational Research: How can I carry out collaborative computational research projects?
  • Cloud computing: How can I get easy user-friendly collaborative access to a remote Linux server?

What are the pain points of the status quo and who feels the pain?

  • Student/Teacher pain:
    • Getting students to install software needed for a course on their computers is a major pain; sometimes it is just impossible, due to no major math software (not even Sage) supporting all recent versions of Windows/Linux/OS X/iOS/Android.
    • Getting university computer labs to install the software you need for a course is frustrating and expensive (time and money).
    • Even if computer labs worked, they are often being used by another course, stuffy, and students can't possibly do all their homework there, so computation gets short shrift. Lab keyboards, hardware, etc., all hard to get used to. Crappy monitors.
    • Painful confusing problems copying files around between teachers and students.
    • Helping a student or collaborator with their specific problem is very hard without physical access to their computer.
  • Researcher pain:
    • Making backups every few minutes of the complete state of everything when doing research often hard and distracting, but important for reproducibility.
    • Copying around documents, emailing or pushing/pulling them to revision control is frustrating and confusing.
    • Installing obscuring software is frustarting and distracting from the research they really want to do.
  • Everybody:
    • It is frustrating not having LIVE working access to your files wherever you are. (Dropbox/Github doesn't solve this, since files are static.)
    • It is difficult to leave computations running remotely.

Why is your technology poised to succeed?

  • When it works, SageMathCloud solves every pain point listed above.
  • The timing is right, due to massive improvements in web browsers during the last 3 years.
  • I am on full sabbatical this year, so at least success isn't totally impossible due to not working on the project.
  • I have been solving the above problems in less scalable ways for myself, colleagues and students since the 1990s.
  • SageMathCloud has many users that provide valuable feedback.
  • We have already solved difficult problems since I started this project in Summer 2012 (and launched first version in April 2013).

Who are your competitors?

There are no competitors with a similar range of functionality. However, there are many webapps that have some overlap in capabilities:
  • Mathematical overlap: Online Mathematica: "Bring Mathematica to life in the cloud"
  • Python overlap: Wakari: "Web-based Python Data Analysis"; also PythonAnywhere
  • Latex overlap: ShareLaTeX, WriteLaTeX
  • Web-based IDE's/terminals: target writing webapps (not research or math education):,,,
  • Homework: WebAssign and WebWork
Right now, SageMathCloud gives away for free far more than any other similar site, due to very substantial temporary financial support from Google, the NSF and others.

What’s the total addressable market?

Though our primary focus is the college mathematics classroom, there is a larger market:
  • Students: all undergrad/high school students in the world taking a course involving programming or mathematics
  • Teachers: all teachers of such courses
  • Researchers: anybody working in areas that involve programming or data analysis
Moreover, the web-based platform for computing that we're building lends itself to many other collaborative applications.

What stage is your technology at?

  • The site is up and running and has 28,413 monthly active users
  • There are still many bugs
  • I have a precise todo list that would take me at least 2 months fulltime to finish.

Is your solution technically feasible at this point?

  • Yes. It is only a matter of time until the software is very polished.
  • Morever, we have compute resources to support significantly more users.
  • But without money (from paying customers or investment), if growth continues at the current rate then we will have to clamp down on free quotas for users.

What technical milestones remain?

  • Infrastructure for creating automatically-graded homework problems.
  • Fill in lots of details and polish.

Do you have external credibility with technical/business experts and customers?

  • Business experts: I don't even know any business experts.
  • Technical experts: I founded the Sage math software, which is 10 years old and relatively well known by mathematicians.
  • Customers: We have no customers; we haven't offered anything for sale.

Market research?

  • I know about math software and its users as a result of founding the Sage open source math software project, NSF-funded projects I've been involved in, etc.

Is the intellectual property around your technology protected?

  • The IP is software.
  • The website software is mostly new Javascript code we wrote that is copyright Univ. of Washington and mostly not open source; it depends on various open source libraries and components.
  • The Sage math software is entirely open source.

Who are the team members to move this technology forward?

I am the only person working on this project fulltime right now.
  • Everything: William Stein -- UW professor
  • Browser client code: Jon Lee, Andy Huchala, Nicholas Ruhland -- UW undergrads
  • Web design, analytics: Harald Schilly -- Austrian grad student
  • Hardware: Keith Clawson

Why are you the ideal team?

  • We are not the ideal team.
  • If I had money maybe I could build the ideal team, leveraging my experience and connections from the Sage project...

by William Stein ( at October 17, 2014 01:04 PM

October 16, 2014

William Stein

Public Sharing in SageMathCloud, Finally

SageMathCloud (SMC) is a free (NSF, Google and UW supported) website that lets you collaboratively work with Sage worksheets, IPython notebooks, LaTeX documents and much, much more. All work is snapshotted every few minutes, and copied out to several data centers, so if something goes wrong with a project running on one machine (right before your lecture begins or homework assignment is due), it will pop up on another machine. We designed the backend architecture from the ground up to be very horizontally scalable and have no single points of failure.

This post is about an important new feature: You can now mark a folder or file so that all other users can view it, and very easily copy it to their own project.

This solves problems:
  • Problem: You create a "template" project, e.g., with pre-installed software, worksheets, IPython notebooks, etc., and want other users to easily be able to clone it as a new project. Solution: Mark the home directory of the project public, and share the link widely.

  • Problem: You create a syllabus for a course, an assignment, a worksheet full of 3d images, etc., that you want to share with a group of students. Solution: Make the syllabus or worksheet public, and share the link with your students via an email and on the course website. (Note: You can also use a course document to share files with all students privately.) For example...

  • Problem: You run into a problem using SMC and want help. Solution: Make the worksheet or code that isn't working public, and post a link in a forum asking for help.
  • Problem: You write a blog post explaining how to solve a problem and write related code in an SMC worksheet, which you want your readers to see. Solution: Make that code public and post a link in your blog post.
Here's a screencast.

Each SMC project has its own local "project server", which takes some time to start up, and serves files, coordinates Sage, terminal, and IPython sessions, etc. Public sharing completely avoids having anything to do with the project server -- it works fine even if the project server is not running -- it's always fast and there is no startup time if the project server isn't running. Moreover, public sharing reads the live files from your project, so you can update the files in a public shared directory, add new files, etc., and users will see these changes (when they refresh, since it's not automatic).
As an example, here is the cloud-examples github repo as a share. If you click on it (and have a SageMathCloud account), you'll see this:

What Next?

There is an enormous amount of natural additional functionality to build on top of public sharing.

For example, not all document types can be previewed in read-only mode right now; in particular, IPython notebooks, task lists, LaTeX documents, images, and PDF files must be copied from the public share to another project before people can view them. It is better to release a first usable version of public sharing before systematically going through and implementing the additional features needed to support all of the above. You can make complicated Sage worksheets with embedded images and 3d graphics, and those can be previewed before copying them to a project.
Right now, the only way to visit a public share is to paste the URL into a browser tab and load it. Soon the projects page will be re-organized so you can search for publicly shared paths, see all public shares that you have previously visited, who shared them, how many +1's they've received, comments, etc.

Also, I plan to eventually make it so public shares will be visible to people who have not logged in, and when viewing a publicly shared file or directory, there will be an option to start it running in a limited project, which will vanish from existence after a period of inactivity (say).

There are also dozens of details that are not yet implemented. For example, it would be nice to be able to directly download files (and directories!) to your computer from a public share. And it's also natural to share a folder or file with a specific list of people, rather than sharing it publicly. If somebody is viewing a public file and you change it, they should likely see the update automatically. Right now when viewing a share, you don't even know who shared it, and if you open a worksheet it can automatically execute Javascript, which is potentially unsafe.  Once public content is easily found, if somebody posts "evil" content publicly, there needs to be an easy way for users to report it.

Sharing will permeate everything

Sharing has been thought about a great deal during the last few years in the context of sites such as Github, Facebook, Google+ and Twitter. With SMC, we've developed a foundation for interactive collaborative computing in a browser, and will introduce sharing on top of that in a way that is motivated by your problems. For example, as with Github or Google+, when somebody makes a copy of your publicly shared folder, this copy should be listed (under "copies") and it could start out public by default. There is much to do.

One reason it took so long to release the first version of public sharing is that I kept imagining that sharing would happen at the level of complete projects, just like sharing in Github. However, when thinking through your problems, it makes way more sense in SMC to share individual directories and files. Technically, sharing at this level works works well for read-only access, not for read-write access, since projects are mapped to Linux accounts. Another reason I have been very hesitant to support sharing is that I've had enormous trouble over the years with spammers posting content that gets me in trouble (with my University -- it is illegal for UW to host advertisements). However, by not letting search engines index content, the motivation for spammers to post nasty content is greatly reduced.

Imagine publicly sharing recipes for automatically gradable homework problems, which use the full power of everything installed in SMC, get forked, improved, used, etc.

by William Stein ( at October 16, 2014 02:29 PM

October 13, 2014

Vince Knight

Reflecting on a first week of learning

This academic year Imogen Dunne (a final year student here at Cardiff University) is doing her final year project looking at evaluating student attitudes in my “Computing for Mathematics” course.

The idea is to use questionnaires, focus groups and interviews to evaluate (longitudinally) mathematics students’ attitudes towards:

  • Learning to code;
  • Learning some early mathematics;
  • Learning in a flipped class environment.

This has started off well, and last week Imogen kicked off a meeting with me by giving me homework. Something along the lines of:

I’d like to see the attitudes of everyone involved from every point of view. Could you perhaps write a diary/log at certain points during the year, describing how you feel things are going?

I was delighted with this idea and thought that I might as well blog these reflections, so here it goes:

A good first week.

The first thing I will say is I think this first week went really well. Students turned up to their lab sessions having almost universally carried out all the required work which was awesome. There was one major change with last year shifting some of the content from each week to the next. This had the effect of making the first week a bit lighter which I think has been a good thing.

Some labs could have been a bit ‘noisier’.

I want labs sessions to be noisy spaces with students talking to each other and figuring things out. This happened in most of the lab sessions I got to see but there was one or two where it felt more like a high school class from when I was still in high school: students looking at me as if I was a teacher evaluating whatever they were saying. I’m using 2nd year students as tutors this year (which I’m super excited about: more about that later) and perhaps I need to do a better job explaining exactly what it is that I want the labs to be. I think this has now been addressed.

The best flipped class meeting I’ve ever had.

I strive for a student centred learning environment. This isn’t always easy to obtain but the first class meeting went extremely well. I came in to the meeting expecting us to talk about the index method on strings (which finds the location of the first occurrence of a substring in a string) as this was the main piece of feedback I had from the labs as to the difficulties of the students. We started talking about it that but then the students really took over and wanted to know how to calculate the location of all the subtrings. This was a really awesome session as we went in to a particular thing in much more detail than we would have otherwise. Most importantly students would never have been able to have that conversation with me if they knew nothing about strings and the index function…

Not leaving anyone behind.

When teaching in a classic lecture based setting it’s extremely difficult to gain an understanding of how your students are doing. This flipped environment is all about finding out how students are doing, I am constantly getting feedback as to what students are having difficulty with. I have to make sure that students understand that that is what the labs are for, I’m constantly saying this but will continue to do so.

Further more there are some students who are having difficulties with the content, this is completely expected but the great thing about teaching using this approach is that I’ve already been able to identify them and will be meeting with them during my office hours to help (I’m always happy to help students who want to work).

Big thanks to the tutors.

Finally, this week of class would not have gone so smoothly if it weren’t for the great tutors who have helped in the labs. These include some of my colleagues who have gratefully given up their time, Jason Young who has done an awesome job organising the undergraduate tutors and most importantly the undergraduate tutors themselves. They all did a great job and I hope will also continue to learn and enjoy writing code.

This has been written down in a slight rush before my next set of labs but hopefully will be useful to Imogen (and indeed perhaps my students).

October 13, 2014 12:00 AM

October 12, 2014

Vince Knight

A playlist of 21 videos introducing LaTeX using SageMathCloud

About a year ago I put together 21 videos showing very basic LaTeX syntax. I used writelatex as the platform for those videos.

You can see that playlist of videos here.

This year I’m planning on using SageMathCloud when I teach Sage to my first years so I thought it was probably worth showing the students LaTeX in the same environment. So I’ve just finished redo’ing the same playlist of 21 videos but this time using SageMathCloud.

You can find that playlist here:

I’ve put all the tex files I created in the video in a github repository so they can be easily cloned in a SageMathCloud project using the https url:

October 12, 2014 12:00 AM

October 04, 2014

Vince Knight

A list of podcasts I listen to

I’ve recently gotten in to podcasts again (again) and really enjoy listening to most of these in the background of whatever it is I’m doing.

The podcasts I listen to probably fall in to the following categories:

  • Technology
  • Science
  • Sport

I’m sure I’m missing a bunch so I thought I’d write a blog post in the hope that kind readers would let me know what I should be listening to.


  • Tech News Today: One of many twit shows I like. I usually listen to this first thing in the morning.
  • This week in tech: I really like Leo Laporte and enjoy listening to this show as background noise to whatever it is that I’m doing.
  • Linux action show: I sometime find these guys ‘defend’ linux on the desktop a bit too fervently but I never fail to learn something new.
  • MacBreak Weekly: This has the same kind of vibe as ‘this week in tech’.
  • All about android: Again, this one feels like hanging out with friends (the probably with real friends being that I can’t work at the same time but I’m working on that).
  • Security Now: Another twit podcast, this one is about security and although I miss a lot about what is talked about I still enjoy it.
  • Le Rendez-Vous Tech: A French podcast very similar to this week in tech. Probably the only bit of exercise my French still gets.
  • Healthy Hacker: This is probably one of my favourite podcasts. Chris talks about code but also general fitness stuff.
  • Not really a podcast (low frequency) but it’s on my podcast catcher so I thought I’d list it here.


  • Freakonomics Radio: A great podcast, extremely well produced and often covering really interesting subjects (this one I usually try not to listen to in the background like most of the above).
  • The infinite monkey cage: Another really good job. Often very funny as well as informative.
  • This week in Science: this is always a nice listen to as they catch up on scientific stories that happened throughout the week.
  • Pythagoras Trousers: Cardiff University graduate Rhys Phillips does an excellent job with this podcast.
  • Math Ed Podcast: A really nice show that in every episode I’ve listened to was a really interesting interview of a mathematical education expert.
  • Stuff you should know: Not too sure if I should put this in this category but it’s a cool show where they explain a bunch of topics.


  • First Take: I don’t get to follow much US sports but I do enjoy listening to Slip and Stephen A. yell at each other.
  • Pardon the Interruption: Same as above really (although with less yelling).
  • Fighting Talk: This is a fun BBC show where sports pundits get points for being funny talking about stuff that happened in the past week.
  • Around the NFL: I like the NFL and when I have time to listen to this one I get to find out what’s happening.
  • Egg chasers podcast: A really great weekly roundup of rugby.
  • Scrum V Radio: A specific podcast about Welsh rugby.

One I’m forgetting in all of that is Relatively Prime which is a mathematics podcast that has just started a kickstarter campaign for a second season. I’ve never listened to it but though it could be worth mentioning.

Any good ones I’m missing?

October 04, 2014 12:00 AM

October 01, 2014

William Stein

SageMathCloud Course Management

by William Stein ( at October 01, 2014 01:05 PM

September 27, 2014

Sébastien Labbé

Abelian complexity of the Oldenburger sequence

The Oldenburger infinite sequence [O39] \[ K = 1221121221221121122121121221121121221221\ldots \] also known under the name of Kolakoski, is equal to its exponent trajectory. The exponent trajectory \(\Delta\) can be obtained by counting the lengths of blocks of consecutive and equal letters: \[ K = 1^12^21^22^11^12^21^12^21^22^11^22^21^12^11^22^11^12^21^22^11^22^11^12^21^12^21^22^11^12^21^12^11^22^11^22^21^12^21^2\ldots \] The sequence of exponents above gives the exponent trajectory of the Oldenburger sequence: \[ \Delta = 12211212212211211221211212\ldots \] which is equal to the original sequence \(K\). You can define this sequence in Sage:

sage: K = words.KolakoskiWord()
sage: K
word: 1221121221221121122121121221121121221221...
sage:          # delta returns the exponent trajectory
word: 1221121221221121122121121221121121221221...

There are a lot of open problem related to basic properties of that sequence. For example, we do not know if that sequence is recurrent, that is, all finite subword or factor (finite block of consecutive letters) always reappear. Also, it is still open to prove whether the density of 1 in that sequence is equal to \(1/2\).

In this blog post, I do some computations on its abelian complexity \(p_{ab}(n)\) defined as the number of distinct abelian vectors of subwords of length \(n\) in the sequence. The abelian vector \(\vec{w}\) of a word \(w\) counts the number of occurences of each letter: \[ w = 12211212212 \quad \mapsto \quad 1^5 2^7 \text{, abelianized} \quad \mapsto \quad \vec{w} = (5, 7) \text{, the abelian vector of } w \]

Here are the abelian vectors of subwords of length 10 and 20 in the prefix of length 100 of the Oldenburger sequence. The functions abelian_vectors and abelian_complexity are not in Sage as of now. Code is available at trac #17058 to be merged in Sage soon:

sage: prefix = words.KolakoskiWord()[:100]
sage: prefix.abelian_vectors(10)
{(4, 6), (5, 5), (6, 4)}
sage: prefix.abelian_vectors(20)
{(8, 12), (9, 11), (10, 10), (11, 9), (12, 8)}

Therefore, the prefix of length 100 has 3 vectors of subwords of length 10 and 5 vectors of subwords of length 20:

sage: p100.abelian_complexity(10)
sage: p100.abelian_complexity(20)

I import the OldenburgerSequence from my optional spkg because it is faster than the implementation in Sage:

sage: from slabbe import KolakoskiWord as OldenburgerSequence
sage: Olden = OldenburgerSequence()

I count the number of abelian vectors of subwords of length 100 in the prefix of length \(2^{20}\) of the Oldenburger sequence:

sage: prefix = Olden[:2^20]
sage: %time prefix.abelian_vectors(100)
CPU times: user 3.48 s, sys: 66.9 ms, total: 3.54 s
Wall time: 3.56 s
{(47, 53), (48, 52), (49, 51), (50, 50), (51, 49), (52, 48), (53, 47)}

Number of abelian vectors of subwords of length less than 100 in the prefix of length \(2^{20}\) of the Oldenburger sequence:

sage: %time L100 = map(prefix.abelian_complexity, range(100))
CPU times: user 3min 20s, sys: 1.08 s, total: 3min 21s
Wall time: 3min 23s
sage: from collections import Counter
sage: Counter(L100)
Counter({5: 26, 6: 26, 4: 17, 7: 15, 3: 8, 8: 4, 2: 3, 1: 1})

Let's draw that:

sage: labels = ('Length of factors', 'Number of abelian vectors')
sage: title = 'Abelian Complexity of the prefix of length $2^{20}$ of Oldenburger sequence'
sage: list_plot(L100, color='green', plotjoined=True, axes_labels=labels, title=title)

It seems to grow something like \(\log(n)\). Let's now consider subwords of length \(2^n\) for \(0\leq n\leq 12\) in the same prefix of length \(2^{20}\):

sage: %time L20 = [(2^n, prefix.abelian_complexity(2^n)) for n in range(20)]
CPU times: user 41 s, sys: 239 ms, total: 41.2 s
Wall time: 41.5 s
sage: L20
[(1, 2), (2, 3), (4, 3), (8, 3), (16, 3), (32, 5), (64, 5), (128, 9),
(256, 9), (512, 13), (1024, 17), (2048, 22), (4096, 27), (8192, 40),
(16384, 46), (32768, 67), (65536, 81), (131072, 85), (262144, 90), (524288, 104)]

I now look at subwords of length \(2^n\) for \(0\leq n\leq 23\) in the longer prefix of length \(2^{24}\):

sage: prefix = Olden[:2^24]
sage: %time L24 = [(2^n, prefix.abelian_complexity(2^n)) for n in range(24)]
CPU times: user 20min 47s, sys: 13.5 s, total: 21min
Wall time: 20min 13s
sage: L24
[(1, 2), (2, 3), (4, 3), (8, 3), (16, 3), (32, 5), (64, 5), (128, 9), (256,
9), (512, 13), (1024, 17), (2048, 23), (4096, 33), (8192, 46), (16384, 58),
(32768, 74), (65536, 98), (131072, 134), (262144, 165), (524288, 229),
(1048576, 302), (2097152, 371), (4194304, 304), (8388608, 329)]

The next graph gather all of the above computations:

sage: G = Graphics()
sage: legend = 'in the prefix of length 2^{}'
sage: G += list_plot(L24, plotjoined=True, thickness=4, color='blue', legend_label=legend.format(24))
sage: G += list_plot(L20, plotjoined=True, thickness=4, color='red', legend_label=legend.format(20))
sage: G += list_plot(L100, plotjoined=True, thickness=4, color='green', legend_label=legend.format(20))
sage: labels = ('Length of factors', 'Number of abelian vectors')
sage: title = 'Abelian complexity of Oldenburger sequence'
sage:'semilogx', 2), axes_labels=labels, title=title)

A linear growth in the above graphics with logarithmic \(x\) abcisse would mean a growth in \(\log(n)\). After those experimentations, my hypothesis is that the abelian complexity of the Oldenburger sequence grows like \(\log(n)^2\).


[O39]Oldenburger, Rufus (1939). "Exponent trajectories in symbolic dynamics". Transactions of the American Mathematical Society 46: 453–466. doi:10.2307/1989933

by Sébastien Labbé at September 27, 2014 10:00 PM

September 24, 2014

Vince Knight

Grey scale network graph colorings in Sage

This is a quick post following a request for some Sage help that a colleague asked for. It’s based on a quick fix and I’m wondering if someone might come up with a better way of doing this that I’ve missed or if it’s worth actually raising a ticket to incorporate something like it in Sage.

So my colleague is writing a book on Graph theory and recently started taking a look at Sage’s capacity to handle Graph theory stuff and colorings in particular. The issue was that said colleague ideally wanted grey scale pictures of the colorings (I’m guessing this is due to the publisher or something - I didn’t ask).

The following creates a bespoke graph and plots a coloring (ensures that adjacent vertices have different colors):

sage: P = graphs.PetersenGraph()
sage: c = P.coloring()
sage: c
[[1, 3, 5, 9], [0, 2, 6], [4, 7, 8]]

Now to get that in to grey scale we could of course open up inkscape or something similar and convert it but it would be nice to be able to directly use something like the matplotlib grey scale color map. This is in fact what I started to look for but with no success so I then started to look for how one converts an RGB tuple (3 floats corresponding to the makeup of a color) to something on a grey scale.

Turns out (see this stackoverflow question which leads to this wiki page), that for \(\text{rgb}=(r,g,b)\), the corresponding grey scale color is given by \(\text{grey}=(Y,Y,Y)\) where \(Y\) is given by:

where \(y\) is given by:

I genuinely have no understanding what so ever as to what that does but the idea is to make use of the Sage rainbow function which returns a given number of colors (very useful for creating plots when you don’t necessarily want to come up with all the color names).

sage: rainbow(10, 'rgbtuple')
[(1.0, 0.0, 0.0),
 (1.0, 0.6000000000000001, 0.0),
 (0.7999999999999998, 1.0, 0.0),
 (0.20000000000000018, 1.0, 0.0),
 (0.0, 1.0, 0.40000000000000036),
 (0.0, 1.0, 1.0),
 (0.0, 0.40000000000000036, 1.0),
 (0.1999999999999993, 0.0, 1.0),
 (0.8000000000000007, 0.0, 1.0),
 (1.0, 0.0, 0.5999999999999996)]

So here’s a function that takes the output of rainbow and maps it to grey scale:

def grey_rainbow(n, black=False):
    Return n greyscale colors
    if black:
        clrs = [0.299*clr[0] + 0.587 * clr[1] + 0.114 * clr[2] for clr in rainbow(n-2,'rgbtuple')]
        clrs = [0.299*clr[0] + 0.587 * clr[1] + 0.114 * clr[2] for clr in rainbow(n-1,'rgbtuple')]
    output = ['white']
    for c in clrs:
        if c <= 0.0031308:
            rgb = 12.92 * c
            rgb = (1.055*c^(1/2.4)-0.055)
    if black:
    return output

Note that we’re including the option to use black as one of the colours or not (it covers up the vertex labels on the corresponding plot if we do). We can then use that function to create our own partition coloring:

def grey_coloring(G, black=False):
    chromatic_nbr = G.chromatic_number()
    coloring = G.coloring()
    grey_colors = grey_rainbow(chromatic_nbr, black)
    d = {}
    for i, c in enumerate(grey_colors):
        d[c] = coloring[i]
    return P.graphplot(vertex_colors=d)

Here is how we can simply use the above to get a grey scale coloring of a graph:

P = graphs.PetersenGraph()
p = grey_coloring(P)

P = graphs.PetersenGraph()
p = grey_coloring(P,black=True)

Now what would be really nice would be to be able to just use any matplotlib color map in the Graph coloring. This might actually already be possible, I’ll fish through the Sage source code at some point and take a look (the awesome thing about Sage is that I can do that). Otherwise, it might just be a quick fix (and hopefully a less hacky one then above - I still laugh at the formulae I use and that seems to work), who nows I might even see if it’s worth opening an actual ticket for this.

Here is a Sage script with the above code

September 24, 2014 12:00 AM

September 21, 2014

Vince Knight

My thoughts on plotly (the github for plots)

A while ago I saw plotly appear on my G+ stream. People seemed excited about it but I was too busy to look at it properly and just assumed: must be some sort of new matplotlib: ain’t nobody got time for that!

Then, one of the guys from plotly reached out saying I should take a look. I took a brief glance and realised that this was nothing like a new matplotlib and in fact looked pretty cool. So I dutifully put it on my to do list but very much near the bottom.

I’m writing this sat in between sessions at PyconUK 2014. One of the talks on the first day was by Chris from plotly. He gave a great talk (which once the video link is up I’ll share here) and I immediately threw ‘check out plotly’ to the top of my to do list.

How I got started

import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import numpy as np
import plotly.plotly as py

n = 50
x, y, z, s, ew = np.random.rand(5, n)
c, ec = np.random.rand(2, n, 4)
area_scale, width_scale = 500, 5

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=c,

plot_url = py.plot_mpl(fig)

That code automatically creates the following plotly plot (which you can edit, zoom in etc…):

By ‘automatically’ I mean: ‘opens up web browser and your plots is there’!!

Doing something of my own

In my previous post I wrote about how to use Markov Chains to obtain the expected wait in a tandem qeue. Here’s a plot I put together that compared the analytical values to simulated values:

The code to obtain that particular plot is below:

# Libraries
from matplotlib.pyplot import plt
import csv
import plotly.plotly as py

# Get parameters

c1   = 12
N    = 9
c2   = 12
mu_1 = 1
mu_2 = .2
p    = .5

# Read analytical data
analytical_data = [[float(k) for k in row] for row in csv.reader(open('analytical.csv', 'r'))]

# Read simulation data
simulation_data = [[float(k) for k in row] for row in csv.reader(open('simulated.csv', 'r'))]

# Create the plot

fig = plt.figure()
ax = plt.subplot(111)
x_sim = [row[0] for row in simulation_data[::10]]  # The datasets have more data than I want to plot so skipping some values
y_sim = [row[1:] for row in simulation_data[::10]]
ax.boxplot(y_sim, positions=x_sim)
x_ana = [row[0] for row in analytical_data if row[0] <= max(x_sim)]
y_ana = [row[1] for row in analytical_data[:len(x_ana)]]
plt.xticks(range(0,int(max(x_ana) + max(int(max(x_ana)) / 10,1)), max(int(max(x_ana)) / 10,1)))
ax.set_ylabel('Mean expected wait')
title="$c_1=%s,\; N=%s,\; c_2=%s,\;\mu_1=%s,\; \mu_2=%s,\; p=%s $" % (c1, N, c1, mu_1, mu_2, p)

# Save the plot as a pdf file locally
plt.savefig("%s-%s-%s-%s-%s-%s.pdf" % (int(c1), int(N), int(c1), mu_1, mu_2, p), bbox_inches='tight')

plot_url = py.plot_mpl(fig)

If you’d like to repeat the above you can download the analytical and simulated datafiles.

The result of that can be seen here:

Further more that is just a ‘thing’ on my plotly profile so you can see it at this url:

Getting other formats

On that page I can tweak the graph if I wanted to and finally I can just grab the plot in whatever format I want by simply adding the correct format extension to the url:

My overall thoughts

So right now I’m just kind of excited about the possibilities (too many ideas to coherently filter out the good ones), there are also packages for R so I might try and get my students to play around with it in R when I teach it…

As a research tool, I think this will also be nice (it’s certainly the way to go). Although recently, I’ve been working remotely with two students and being able to throw a png of a plot in a hangout chat is pretty cool (and mobile friendly). So maybe that’s something the plotly guys could think about…

At the end of the day: this is an awesome tool. Plotly ‘abstractifies’ plots so that people using different packages/languages can still talk to each other. One of the big things I’m forgetting to talk about in detail is that there’s a web tool that allows you to change colors, change titles, mess with the data etc. That’s also a very cool collaborative tool obviously as I can imagine throwing up a plot that a co-author who doesn’t like code could then tweak.

Similarly (if/when) publications start using smarter formats (than things that are restricted by the need to be printed on paper) you could even just embed the plots like I’ve done here (so people could zoom, grab the data etc…). Here’s another way I could put that:

Papers are where plots go to die, they can go to plotly to live…

Woops, I’ve started blurting out some ideas… Hopefully they’re good ones.

I look forward to playing around with this tool some more (I need to see how it behaves with a Sage plot…).

September 21, 2014 12:00 AM

September 19, 2014

Vince Knight

Calculating the expected wait in a tandem queue with blocking using Sage

In this post I’ll describe a particular mathematical model that I’ve been working on for the purpose of a research paper. This is actually part of some work that I’ve done with +James Campbell an undergraduate who worked with me over the Summer.

Consider the system shown in the picture below:

This is a system composed of ‘two queues in tandem’ each with a number of servers \(c\) and a service rate \(\mu\). It is assumed here (as is often the case in queueing theory) that the service rate is exponentially distributed (in effect that the service time is random with mean \(1/\mu\)).

Individuals arrive at the queue with mean arrival rate \(\Lambda\) (once again exponentially distributed).

There is room for up to \(N\) individuals to wait for a free server at the first station. After their service at the first station is complete, individuals leave the system with probability \(p\), but if they don’t and there is no free place in the next station (ie there are not less than \(c_2\) individuals in the second service center) then they become blocked.

There are a vast array of applications of queueing systems like the above (in particular in the paper we’re working on we’re using it to model a healthcare system).

In this post I’ll describe how to use Markov chains to be able to describe the system and in particular how to get the expected wait for an arrival at the queue.

I have discussed Markov chains before (mainly posts on my old blog) and so I won’t go in to too much detail about the underlying theory (I think this is perhaps a slightly technical post so it is mainly aimed at people familiar with queueing theory but by all means ask in the comments if I can clarify anything).

The state space

One has to think about this carefully as it’s important to keep track not only of where individuals are but whether or not they are blocked. Based on that one might think of using a 3 dimensional state space: \((i,j,k)\) where \(i\) would denote the number of people at the first station, \(j\) the number at the second and \(k\) the number at the first who are blocked. This wouldn’t be a terrible idea but we can actually do things in a slightly neater and more compact way:

In the above \(i\) denotes the number of individuals in service or waiting for service at the first station and \(j\) denotes the number of individuals in service at the second station or blocked at the first station.

The continuous time transition rates between two states \((i_1,j_1)\) and \((i_2, j_2)\) are given by:

where \(\delta=(i_2,j_2)-(i_1,j_1)\).

Here’s a picture of the Markov Chain for \(N=c_1=c_2=2\):

The steady state probabilities

Using the above we can index our states and keep all the information in a matrix \(Q\), to obtain the steady state distributions of the chain (the probabilities of finding the queue in a given state) we then simply solve the following equation:

subject to \(\sum \pi = 1\).

Here’s how we can do this using Sage:

class Tandem_Queue():
        A class for an instance of the tandem_queue
        def __init__(self, c_1, N, c_2, Lambda, mu_1, mu_2, p):
            self.c_1 = c_1
            self.c_2 = c_2
            self.N = N
            self.Lambda = Lambda
            self.mu_1 = mu_1
            self.mu_2 = mu_2
            self.p = p
            self.m = c_1 + c_2 + 1
            self.n = c_2 + 1
            self.state_space = [(i, j)  for j in range(c_1 + c_2 + 1) for i in range(self.c_1 + self.N - max(j - self.c_2, 0) + 1)]
            if p == 1:  # Reduces state space in particular case of p = 1
                self.state_space = [state for state in self.state_space if state[1] == 0]
            Q = [[self.q(state1, state2) for state2 in self.state_space] for state1 in self.state_space]
            for i in range(len(Q)):
                Q[i][i] = - sum(Q[i])
            self.Q = matrix(QQ, Q)
            self.expected_wait_cache = {}

        def q(self, state1, state2):
            Returns the rate of transition between to given states.
            delta = list(vector(state2) - vector(state1))
            if delta == [1, 0]:
                return self.Lambda
            if delta == [-1, 1]:
                return min(self.c_1 - max(state1[1] - self.c_2, 0), state1[0]) * self.mu_1 * (1 - self.p)
            if delta == [-1, 0]:
                return min(self.c_1 - max(state1[1] - self.c_2, 0), state1[0]) * self.mu_1 * self.p
            if delta == [0, -1]:
                return min(state1[1], self.c_2) * self.mu_2
            return 0

        def pi(self):
            Solves linear system.
            A = transpose(self.Q).stack(vector([1 for state in self.state_space]))
            return A.solve_right(vector([0 for state in self.state_space] + [1]))

Most of the above is glorified book keeping but here’s a quick example showing what the above does and how it can be used. First let’s create an instance of our problem with \(N=c_1=c_2=2\), \(5\mu_2=2p=\mu_1=1\) and \(\Lambda=5\).

sage: small_example = tandem_queue(2,2,2,5,1,1/5,1/2)
sage: small_example.Q
22 x 22 dense matrix over Rational Field (use the '.str()' method to see the entries)

We see that if we return \(Q\) we get a 22 by 22 matrix which if you recall the picture above corresponds to the 22 states in that picture.

We can see the states by just typing:

sage: small_example.state_space
[(0, 0),
 (1, 0),
 (2, 0),
 (3, 0),
 (4, 0),
 (0, 1),
 (1, 1),
 (2, 1),
 (3, 1),
 (4, 1),
 (0, 2),
 (1, 2),
 (2, 2),
 (3, 2),
 (4, 2),
 (0, 3),
 (1, 3),
 (2, 3),
 (3, 3),
 (0, 4),
 (1, 4),
 (2, 4)]

If you check carefully they all correspond to the states of the picture above.

Now what we would like to know is the probability of being in any given state. To do this we need to solve the matrix equation \(\pi Q = 0\) such that \(\sum \pi=1\).

This is done in Sage (for any matrix Q) using the following:

sage: A = transpose(Q).stack(vector([1 for state in self.state_space]))
sage: A.solve_right(vector([0 for state in self.state_space] + [1]))

There we build a matrix A with an added column of 1s and then solve the corresponding equation using solve_right (note that we transposed the matrix). If you look at the class definition this was all defined earlier so we can in fact just run:

sage: small_example.pi()
(974420487508335328592/55207801002325145206717477, 6717141852060739142960/55207801002325145206717477, 25263720112088475982400/55207801002325145206717477, 107693117265184715581000/55207801002325145206717477, 499825193288571759140000/55207801002325145206717477, 7567657557556535357400/55207801002325145206717477, 50835142813671411162000/55207801002325145206717477, 177836071295654602905000/55207801002325145206717477, 638540135036394350075000/55207801002325145206717477, 2305924001256099701875000/55207801002325145206717477, 26439192416069771765000/55207801002325145206717477, 185599515623092483825000/55207801002325145206717477, 700026867396942548625000/55207801002325145206717477, 2256398553097737112500000/55207801002325145206717477, 4700830318953618984375000/55207801002325145206717477, 61385774570987050093750/55207801002325145206717477, 444444998037114715312500/55207801002325145206717477, 3393156381219452445312500/55207801002325145206717477, 15476151589322058007812500/55207801002325145206717477, 41152314633066177343750/55207801002325145206717477, 352285141439825390625000/55207801002325145206717477, 1826827211896183837890625/4246753923255780400516729)

That’s not super helpful displayed like that (all the arithmetic is being done exactly) so here’s a quick plot of the probabilities:

sage: p = list_plot(small_example.pi(), axes_labels = ['State', 'Probability'])
sage: p

That plot could be made a lot prettier an informative (by for example using the names of the states as xticks) but it will do for now. We can see from there for example that the most probable state of our queue (with the parameters we picked) is to be in the last state (see list above) which is \((2,4)\).

Out of interest here’s a plot when we change $\Lambda=1/2$ (a tenth of what we did above):

We see that now the most probable state is the sixth state (Python/Sage indexing starts at 0), which corresponds to \((0,1)\).

Here’s an animated plot of the steady state distribution for a larger system as \(\Lambda\) increases, displayed in a more informative way (the two dimensions corresponding to \(i\) and \(j\)):

All of that is very nice and interesting but where things get very useful is when trying to calculate the mean time one would expect to wait in a queue.

Mean expected wait in the queue

If we consider all states \((i,j)\in S\) only a subset of them will actually imply a wait:

  • If there are less than \(c_1\) individuals in the first station then anyone who arrives has direct access to a server;
  • If there are more than \(N+c_1\) individuals in the first station then anyone who arrives will be lost to the system.

With a little bit of thought (recalling what the \(i\)s and \(j\)s represent) we see that the states that incur a wait are given by:

Now if we know the expected wait when arriving in any state \((i,j) \in S_A\) we can get the mean wait as:

where \(w(i,j)\) denotes the expected time spent in any given state \((i,j)\). We are in essence, summing over arrivals that will have a wait and dividing by the probability of an individual not being lost to the system, which happens when \(i + \max(j - c_2, 0) < c_1 + N\).

Obtaining \(c(i,j)\) is relatively simple. We consider the ‘almost’ same Markov chain as above (except that we ignore arrivals). In this instance a jump from one step to another will only occur if a service occurs at the first station (with rate \(\min(c_1-\max(j-c_2,0),i)\mu_1\)) or if a service at the second station (with rate \(\min(c_2, j)\mu_2\)).

So the mean time spent in state \((i,j)\) is the inverse of the total exit rate:

Using that notion we are in effect discretizing the ‘ignore arrivals’ Markov chain. Once a transition occurs we can obtain the probability of where the service occurs:

  • The probability of the service being at the first station:
  • The probability of the service being at the second station:

We can use all of the above to put together a nice recursive formula for the expected wait \(w(i,j)\) in terms of the expected wait of states that are in effect getting closer and closer to having no wait:

where \(A\), is the set of states with no wait: \(i+\max(j-c_2,0) < c_1\)

Using the recursive formula is actually very easy to implement in code (we use a dictionary to cache calculated values to not make sure we don’t waste any time).

Here is a reduced version of the methods that need to be added to above to get this to work in Sage (you need to add self.expected_wait_cache = {} to the __init__ method):

def _pi_dict(self):
        Obtain a dictionary which indexes the states.
        self.pi_list = self.pi()
        return {state:self.pi_list[index] for index, state in enumerate(self.state_space)}

    def p_service_1(self, state):
        Returns the discretized probability of a service occurring at first station
        if self.p == 1:
            return 1
        return min(self.c_1 - max(state[1]- self.c_2, 0), state[0]) * self.mu_1 / (min(self.c_1 - max(state[1]- self.c_2, 0), state[0]) * self.mu_1 + min(self.c_2, state[1]) * self.mu_2)

    def p_service_2(self, state):
        Returns the discretized probability of a service occurring at second station
        if self.p == 1:
            return 0
        return  min(self.c_2, state[1]) * self.mu_2 / (min(self.c_1 - max(state[1]- self.c_2, 0), state[0]) * self.mu_1 + min(self.c_2, state[1]) * self.mu_2)

    def mean_time_in_state(self, state):
        Returns the mean time in any given state before a transition occurs
        return  1 / (min(self.c_1 - max(state[1]- self.c_2, 0), state[0]) * self.mu_1 + min(self.c_2, state[1]) * self.mu_2)

    def expected_wait(self, state):
        Function that returns the expected time till absorption for a given state
        if state in self.expected_wait_cache:
            return self.expected_wait_cache[state]
        if state not in self.state_space:  # If state outside of boundary. (Might not need this after new conditions below)
            return 0
        if state[0] + max(state[1] - self.c_2, 0) < self.c_1:  # If absorbing state
            self.expected_wait_cache[state] = 0
            return 0
        self.expected_wait_cache[state] =  (self.mean_time_in_state(state) + self.p * self.p_service_1(state) * self.expected_wait((state[0] - 1, state[1])))
        if (state[0] - 1, state[1] + 1) in self.state_space:
            self.expected_wait_cache[state] += (1-self.p) * self.p_service_1(state) * self.expected_wait((state[0] - 1, state[1] + 1))
        if (state[0], state[1] - 1) in self.state_space:
            self.expected_wait_cache[state] += self.p_service_2(state) * self.expected_wait((state[0], state[1] - 1))
        return self.expected_wait_cache[state]

    def mean_expected_wait(self):
        Returns the mean wait
        self.pi_dict = self._pi_dict()
        accepting_states = [state for state in [s for s in self.state_space if s[0] + max(s[1] - self.c_2, 0) < self.c_1 + self.N]]
        prob_of_accepting = sum([self.pi_dict[state] for state in accepting_states])
        return sum([self.expected_wait(state) * self.pi_dict[state] for state in accepting_states]) / prob_of_accepting

Using all of the above we can get the expected wait for our system:

sage: small_example = Tandem_Queue(2,2,2,1/2,1,1/5,1/2)
sage: small_example.mean_expected_wait()
sage: n(_)

Below is a plot showing the effect on the mean wait as demand increases in the plot below for a large system:

What that plot shows is the calculated values (solid blue) line going through box plots of the simulated value. Perhaps in another blog post some day I’ll write about how to simulate the above but I think that’s probability sufficient for now.

If it’s of interest all of the above code can be downloaded here or at this gist.

September 19, 2014 12:00 AM

August 27, 2014

Sébastien Labbé

slabbe-0.1.spkg released

These is a summary of the functionalities present in slabbe-0.1 optional Sage package. It depends on version 6.3 of Sage because it uses RecursivelyEnumeratedSet code that was merged in 6.3. It contains modules on digital geometry, combinatorics on words and more.

Install the optional spkg (depends on sage-6.3):

sage -i

In each of the example below, you first have to import the module once and for all:

sage: from slabbe import *

To construct the image below, make sure to use tikz package so that view is able to compile tikz code when called:

sage: latex.add_to_preamble("\\usepackage{tikz}")
sage: latex.extra_preamble()

Draw the part of a discrete plane

sage: p = DiscretePlane([1,pi,7], 1+pi+7, mu=0)
sage: d = DiscreteTube([-5,5],[-5,5])
sage: I = p & d
sage: I
Intersection of the following objects:
Set of points x in ZZ^3 satisfying: 0 <= (1, pi, 7) . x + 0 < pi + 8
DiscreteTube: Preimage of [-5, 5] x [-5, 5] by a 2 by 3 matrix
sage: clip = d.clip()
sage: tikz = I.tikz(clip=clip)
sage: view(tikz, tightpage=True)

Draw the part of a discrete line

sage: L = DiscreteLine([-2,3], 5)
sage: b = DiscreteBox([0,10], [0,10])
sage: I = L & b
sage: I
Intersection of the following objects:
Set of points x in ZZ^2 satisfying: 0 <= (-2, 3) . x + 0 < 5
[0, 10] x [0, 10]
sage: I.plot()

Double square tiles

This module was developped for the article on the combinatorial properties of double square tiles written with Ariane Garon and Alexandre Blondin Massé [BGL2012]. The original version of the code was written with Alexandre.

sage: D = DoubleSquare((34,21,34,21))
sage: D
Double Square Tile
  w0 = 3032321232303010303230301012101030   w4 = 1210103010121232121012123230323212
  w1 = 323030103032321232303                w5 = 101212321210103010121
  w2 = 2321210121232303232123230301030323   w6 = 0103032303010121010301012123212101
  w3 = 212323032321210121232                w7 = 030101210103032303010
(|w0|, |w1|, |w2|, |w3|) = (34, 21, 34, 21)
(d0, d1, d2, d3)         = (42, 68, 42, 68)
(n0, n1, n2, n3)         = (0, 0, 0, 0)
sage: D.plot()
sage: D.extend(0).extend(1).plot()

We have shown that using two invertible operations (called SWAP and TRIM), every double square tiles can be reduced to the unit square:

sage: D.plot_reduction()

The reduction operations are:

sage: D.reduction()
['SWAP_1', 'TRIM_1', 'TRIM_3', 'SWAP_1', 'TRIM_1', 'TRIM_3', 'TRIM_0', 'TRIM_2']

The result of the reduction is the unit square:

sage: unit_square = D.apply(D.reduction())
sage: unit_square
Double Square Tile
  w0 =     w4 =
  w1 = 3   w5 = 1
  w2 =     w6 =
  w3 = 2   w7 = 0
(|w0|, |w1|, |w2|, |w3|) = (0, 1, 0, 1)
(d0, d1, d2, d3)         = (2, 0, 2, 0)
(n0, n1, n2, n3)         = (0, NaN, 0, NaN)
sage: unit_square.plot()

Since SWAP and TRIM are invertible operations, we can recover every double square from the unit square:

sage: E = unit_square.extend(2).extend(0).extend(3).extend(1).swap(1).extend(3).extend(1).swap(1)
sage: D == E

Christoffel graphs

This module was developped for the article on a d-dimensional extension of Christoffel Words written with Christophe Reutenauer [LR2014].

sage: G = ChristoffelGraph((6,10,15))
sage: G
Christoffel set of edges for normal vector v=(6, 10, 15)
sage: tikz = G.tikz_kernel()
sage: view(tikz, tightpage=True)

Bispecial extension types

This module was developped for the article on the factor complexity of infinite sequences genereated by substitutions written with Valérie Berthé [BL2014].

The extension type of an ordinary bispecial factor:

sage: L = [(1,3), (2,3), (3,1), (3,2), (3,3)]
sage: E = ExtensionType1to1(L, alphabet=(1,2,3))
sage: E
  E(w)   1   2   3
   1             X
   2             X
   3     X   X   X
 m(w)=0, ordinary
sage: E.is_ordinaire()

Creation of a strong-weak pair of bispecial words from a neutral not ordinaire word:

sage: p23 = WordMorphism({1:[1,2,3],2:[2,3],3:[3]})
sage: e = ExtensionType1to1([(1,2),(2,3),(3,1),(3,2),(3,3)], [1,2,3])
sage: e
  E(w)   1   2   3
   1         X
   2             X
   3     X   X   X
 m(w)=0, not ord.
sage: A,B = e.apply(p23)
sage: A
  E(3w)   1   2   3
    2         X   X
    3     X   X   X
 m(w)=1, not ord.
sage: B
  E(23w)   1   2   3
    1          X
    3              X
 m(w)=-1, not ord.

Fast Kolakoski word

This module was written for fun. It uses cython implementation inspired from the 10 lines of C code written by Dominique Bernardi and shared during Sage Days 28 in Orsay, France, in January 2011.

sage: K = KolakoskiWord()
sage: K
word: 1221121221221121122121121221121121221221...
sage: %time K[10^5]
CPU times: user 1.56 ms, sys: 7 µs, total: 1.57 ms
Wall time: 1.57 ms
sage: %time K[10^6]
CPU times: user 15.8 ms, sys: 30 µs, total: 15.8 ms
Wall time: 15.9 ms
sage: %time K[10^8]
CPU times: user 1.58 s, sys: 2.28 ms, total: 1.58 s
Wall time: 1.59 s
sage: %time K[10^9]
CPU times: user 15.8 s, sys: 12.4 ms, total: 15.9 s
Wall time: 15.9 s

This is much faster than the Python implementation available in Sage:

sage: K = words.KolakoskiWord()
sage: %time K[10^5]
CPU times: user 779 ms, sys: 25.9 ms, total: 805 ms
Wall time: 802 ms


[BGL2012]A. Blondin Massé, A. Garon, S. Labbé, Combinatorial properties of double square tiles, Theoretical Computer Science 502 (2013) 98-117. doi:10.1016/j.tcs.2012.10.040
[LR2014]Labbé, Sébastien, and Christophe Reutenauer. A d-dimensional Extension of Christoffel Words. arXiv:1404.4021 (April 15, 2014).
[BL2014]V. Berthé, S. Labbé, Factor Complexity of S-adic sequences generated by the Arnoux-Rauzy-Poincaré Algorithm. arXiv:1404.4189 (April, 2014).

by Sébastien Labbé at August 27, 2014 04:53 PM

Releasing slabbe, my own Sage package

Since two years I wrote thousands of line of private code for my own research. Each module having between 500 and 2000 lines of code. The code which is the more clean corresponds to code written in conjunction with research articles. People who know me know that I systematically put docstrings and doctests in my code to facilitate reuse of the code by myself, but also in the idea of sharing it and eventually making it public.

I did not made that code into Sage because it was not mature enough. Also, when I tried to make a complete module go into Sage (see #13069 and #13346), then the monstrous never evolving #12224 became a dependency of the first and the second was unofficially reviewed asking me to split it into smaller chunks to make the review process easier. I never did it because I spent already too much time on it (making a module 100% doctested takes time). Also, the module was corresponding to a published article and I wanted to leave it just like that.

Getting new modules into Sage is hard

In general, the introduction of a complete new module into Sage is hard especially for beginners. Here are two examples I feel responsible for: #10519 is 4 years old and counting, the author has a new work and responsabilities; in #12996, the author was decouraged by the amount of work given by the reviewers. There is a lot of things a beginner has to consider to obtain a positive review. And even for a more advanced developper, other difficulties arise. Indeed, a module introduces a lot of new functions and it may also introduce a lot of new bugs... and Sage developpers are sometimes reluctant to give it a positive review. And if it finally gets a positive review, it is not available easily to normal users of Sage until the next release of Sage.

Releasing my own Sage package

Still I felt the need around me to make my code public. But how? There are people (a few of course but I know there are) who are interested in reproducing computations and images done in my articles. This is why I came to the idea of releasing my own Sage package containing my public research code. This way both developpers and colleagues that are user of Sage but not developpers will be able to install and use my code. This will make people more aware if there is something useful in a module for them. And if one day, somebody tells me: "this should be in Sage", then I will be able to say : "I agree! Do you want to review it?".

Old style Sage package vs New sytle git Sage package

Then I had to chose between the old and the new style for Sage packages. I did not like the new style, because

  • I wanted the history of my package to be independant of the history of Sage,
  • I wanted it to be as easy to install as sage -i slabbe,
  • I wanted it to work on any recent enough version of Sage,
  • I wanted to be able to release a new version, give it to a colleague who could install it right away without changing its own Sage (i.e., updating the checksums).

Therefore, I choose the old style. I based my work on other optional Sage packages, namely the SageManifolds spkg and the ore_algebra spkg.

Content of the initial version

The initial version of the slabbe Sage package has modules concerning four topics: Digital geometry, Combinatorics on words, Combinatorics and Python class inheritance.


For installation or for release notes of the initial version of the spkg, consult the slabbe spkg section of the Sage page of this website.

by Sébastien Labbé at August 27, 2014 04:48 PM

William Stein

What is SageMathCloud: let's clear some things up

[PDF version of this blog post]
"You will have to close source and commercialize Sage. It's inevitable." -- Michael Monagan, cofounder of Maple, told me this in 2006.
SageMathCloud (SMC) is a website that I first launched in April 2013, through which you can use Sage and all other open source math software online, edit Latex documents, IPython notebooks, Sage worksheets, track your todo items, and many other types of documents. You can write, compile, and run code in most programming languages, and use a color command line terminal. There is realtime collaboration on everything through shared projects, terminals, etc. Each project comes with a default quota of 5GB disk space and 8GB of RAM.

SMC is fun to use, pretty to look at, frequently backs up your work in many ways, is fault tolerant, encourages collaboration, and provides a web-based way to use standard command-line tools.

The Relationship with the SageMath Software

The goal of the SageMath software project, which I founded in 2005, is to create a viable free open source alternative to Magma, Mathematica, Maple, and Matlab. SMC is not mathematics software -- instead, SMC is best viewed by analogy as a browser-based version of a Linux desktop environment like KDE or Gnome. The vast majority of the code we write for SMC involves text editor issues (problems similar to those confronted by Emacs or Vim), personal information management, support for editing LaTeX documents, terminals, file management, etc. There is almost no mathematics involved at all.

That said, the main software I use is Sage, so of course support for Sage is a primary focus. SMC is a software environment that is being optimized for its users, who are mostly college students and teachers who use Sage (or Python) in conjunction with their courses. A big motivation for the existence of SMC is to make Sage much more accessible, since growth of Sage has stagnated since 2011, with the number one show-stopper obstruction being the difficulty of students installing Sage.

Sage is Failing

Measured by the mission statement, Sage has overall failed. The core goal is to provide similar functionality to Magma (and the other Ma's) across the board, and the Sage development model and community has failed to do this across the board, since after 9 years, based on our current progress, we will never get there. There are numerous core areas of research mathematics that I'm personally familiar with (in arithmetic geometry), where Sage has barely moved in years and Sage does only a few percent of what Magma does. Unless there is a viable plan for the areas to all be systematically addressed in a reasonable timeframe, not just with arithmetic geometry in Magma, but with everything in Mathematica, Maple., etc, we are definitely failing at the main goal I have for the Sage math software project.

I have absolutely no doubt that money combined with good planning and management would make it possible to achieve our mission statement. I've seen this hundreds of times over at a small scale at Sage Days workshops during the last decade. And let's not forget that with very substantial funding, Linux now provides a viable free open source alternative to Microsoft Windows. Just providing Sage developers with travel expenses (and 0 salary) is enough to get a huge amount done, when possible. But all my attempts with foundations and other clients to get any significant funding, at even the level of 1% of the funding that Mathematica gets each year, has failed. For the life of the Sage project, we've never got more than maybe 0.1% of what Mathematica gets in revenue. It's just a fact that the mathematics community provides Mathematica $50+ million a year, enough to fund over 600 fulltime positions, and they won't provide enough to fund one single Sage developer fulltime.

But the Sage mission statement remains, and even if everybody else in the world gives up on it, I HAVE NOT. SMC is my last ditch strategy to provide resources and visibility so we can succeed at this goal and give the world a viable free open source alternative to the Ma's. I wish I were writing interesting mathematical software, but I'm not, because I'm sucking it up and playing the long game.

The Users of SMC

During the last academic year (e.g., April 2014) there were about 20K "monthly active users" (as defined by Google Analytics), 6K weekly active users, and usually around 300 simultaneous connected users. The summer months have been slower, due to less teaching.

Numerically most users are undergraduate students in courses, who are asked to use SMC in conjunction with a course. There's also quite a bit of usage of SMC by people doing research in mathematics, statistics, economics, etc. -- pretty much all computational sciences. Very roughly, people create Sage worksheets, IPython notebooks, and Latex documents in somewhat equal proportions.

What SMC runs on

Technically, SMC is a multi-datacenter web application without specific dependencies on particular cloud provider functionality. In particular, we use the Cassandra database, and custom backend services written in Node.js (about 15,000 lines of backend code). We also use Amazon's Route 53 service for geographically aware DNS. There are two racks containing dedicated computers on opposites sides of campus at University of Washington with 19 total machines, each with about 1TB SSD, 4TB+ HDD, and 96GB RAM. We also have dozens of VM's running at 2 Google data centers to the east.

A substantial fraction of the work in implementing SMC has been in designing and implementing (and reimplementing many times, in response to real usage) a robust replicated backend infrastructure for projects, with regular snapshots and automatic failover across data centers. As I write this, users have created 66677 projects; each project is a self-contained Linux account whose files are replicated across several data centers.

The Source Code of SMC

The underlying source of SMC, both the backend server and frontend client, is mostly written in CoffeeScript. The frontend (which is nearly 20,000 lines of code) is implemented using the "progressive refinement" approach to HTML5/CSS/Javascript web development. We do not use any Javascript single page app frameworks, though we make heavy use of Bootstrap3 and jQuery. All of the library dependencies of SMC, e.g., CodeMirror, Bootstrap, jQuery, etc. for SMC are licensed under very permissive BSD/MIT, etc. libraries. In particular, absolutely nothing in the Javascript software stack is GPL or AGPL licensed. The plan is that any SMC source code that will be open sourced will be released under the BSD license. Some of the SMC source code is not publicly available, and is owned by University of Washington. But other code, e.g., the realtime sync code, is already available.
Some of the functionality of SMC, for example Sage worksheets, communicate with a separate process via a TCP connection. That separate process is in some cases a GPL'd program such as Sage, R, or Octave, so the viral nature of the GPL does not apply to SMC. Also, of course the virtual machines are running the Linux operating system, which is mostly GPL licensed. (There is absolutely no AGPL-licensed code anywhere in the picture.)

Note that since none of the SMC server and client code links (even at an interpreter level) with any GPL'd software, that code can be legally distributed under any license (e.g., from BSD to commercial).
Also we plan to create a fully open source version of the Sage worksheet server part of SMC for inclusion with Sage. This is not our top priority, since there are several absolutely critical tasks that still must be finished first on SMC, e.g., basic course management.

The SMC Business Model

The University of Washington Center for Commercialization (C4C) has been very involved and supportive since the start of the projects. There are no financial investors or separate company; instead, funding comes from UW, some unspent grant funds that were about to expire, and a substantial Google "Academic Education Grant" ($60K). Our first customer is the "US Army Engineer Research and Development Center", which just started a support/license agreement to run their own SMC internally. We don't currently offer a SaaS product for sale yet -- the options for what can be sold by UW are constrained, since UW is a not-for-profit state university. Currently users receive enhancements to their projects (e.g., increased RAM or disk space) in exchange for explaining to me the interesting research or teaching they are doing with SMC.

The longterm plan is to start a separate for-profit company if we build a sufficient customer base. If this company is successful, it would also support fulltime development of Sage (e.g., via teaching buyouts for faculty, support of students, etc.), similar to how Magma (and Mathematica, etc.) development is funded.

In conclusion, in response to Michael Monagan, you are wrong. And you are right.

by William Stein ( at August 27, 2014 07:55 AM

You don't really think that Sage has failed, do you?

I just received an email from a postdoc in Europe, and very longtime contributor to the Sage project.  He's asking for a letter of recommendation, since he has to leave the world of mathematical software development (after a decade of training and experience), so that he can take a job at hedge fund.  He ends his request with the question:

> P.S. You don't _really_ think that Sage has failed, do you?

After almost exactly 10 years of working on the Sage project, I absolutely do think it has failed to accomplish the stated goal of the mission statement: "Create a viable free open source alternative to Magma, Maple, Mathematica and Matlab.".     When it was only a few years into the project, it was really hard to evaluate progress against such a lofty mission statement.  However, after 10 years, it's clear to me that not only have we not got there, we are not going to ever get there before I retire.   And that's definitely a failure.   

Here's a very rough quote I overheard at lunch today at Sage Days 61, from John Voight, who wrote much quaternion algebra code in Magma: "I'm making a list of what is missing from Sage that Magma has for working with quaternion algebras.  However, it's so incredibly daunting, that I don't want to put in everything.  I've been working on Magma's quaternion algebras for over 10 years, as have several other people.  It's truly daunting how much functionality Magma has compared to Sage."

The only possible way Sage will not fail at the stated mission is if I can get several million dollars a year in money to support developers to work fulltime on implementing interesting core mathematical algorithms.  This is something that Magma has had for over 20 years, and that Maple, Matlab, and Mathematica also have.   That I don't have such funding is probably why you are about to take a job at a hedge fund.    If I had the money, I would try to hire a few of the absolute best people (rather than a bunch of amateurs), people like you, Robert Bradshaw, etc. -- we know who is good. (And clearly I mean serious salaries, not grad student wages!)

So yes, I think the current approach to Sage has failed.    I am going to try another approach, namely SageMathCloud.  If it works, maybe the world will get a free open source alternative to Magma, Mathematica, etc.  Otherwise, maybe the world never ever will.      If you care like I do about having such a thing, and you're teaching course, or whatever, maybe try using SageMathCloud.   If enough people use SageMathCloud for college teaching, then maybe a business model will emerge, and Sage will get proper funding.   

by William Stein ( at August 27, 2014 07:52 AM

Vince Knight

A Sneak Preview of Game Theory in Sage (2/3): Matching Games

In my previous post here I described some of the Sage development that +James Campbell and I spent a lot of time this Summer working on. In that post I described some work that has subsequently been accepted and included in the latest release of Sage (here’s the latest changlog): code to calculate the Shapley value.

In this post I’ll talk about the second of 3 tickets that James and I worked on: looking at Matching games. This has not actually been reviewed yet so please do help us get this code in to Sage by taking a look at the ticket: 16331.

What is a matching game?

One of the best explanations of a matching game (also called the stable marriage problem) can be found in this video. That video really is awesome but it might be a bit long (it’s 25 minutes) so this very short video I threw together for a class I teach might be of interest (it is no where near as good as the previous one but it’s 3 minutes long).

Basically a matching game attempts to create links between two groups of people (referred to as suitors and reviewers) in such a way as no one wants to break their link:

In the above picture we see the preferences of the suitors and the reviewers. So \(c\), prefers \(B\) to \(A\), and \(A\) to \(C\).

Here is the actual definition of a stable matching that I give my students:

A matching game of size \(N\) is defined by two disjoint sets \(S\) and \(R\) or suitors and reviewers of size \(N\). Associated to each element of \(S\) and \(R\) is a preference list:

A matching \(M\) is a any bijection between \(S\) and \(R\). If \(s\in S\) and \(r\in R\) are matched by \(M\) we denote:

The above image defines a matching game, one possible matching could be given below:

It’s immediate to note however that \(B\) and \(c\) prefer each other to their current matching: so the above matching is unstable. In that example \((B,c)\) is called a ‘blocking pair’.

Luckily Gale and Shapley obtained an algorithm that guarantees a stable matching and this is what James and I put together in to Sage.

First, let’s define a matching game:

sage: suitr_pref = {'a': ('B', 'A', 'C'),
....:               'b': ('B', 'C', 'A'),
....:               'c': ('A', 'B', 'C')}
sage: reviewr_pref = {'A': ('a', 'b', 'c'),
....:                 'B': ('a', 'c', 'b'),
....:                 'C': ('b', 'c', 'a')}
sage: m = MatchingGame([suitr_pref, reviewr_pref])

You can see that python dictionaries are used for the functions \(f\) and \(g\) described above (the suitor preferences).

If you tab complete after typing m. you can see some of the methods and attributes associated with the MatchingGame class:

sage: m.
m.add_reviewer  m.bi_partite    m.db            m.dumps         m.rename        m.reviewers     m.solve         m.version
m.add_suitor    m.category      m.dump          m.plot          m.reset_name          m.suitors

I won’t go in to much of the details of that year but you can get some help on anyone of those by typing ? after one of them (below you can see some of the output):

sage: m.solve?

Type:            instancemethod
File:            /Users/vince/sage/local/lib/python2.7/site-packages/sage/game_theory/
Definition:      m.solve(self, invert=False)
   Computes a stable matching for the game using the Gale-Shapley


Let’s give that method a spin (as you can see it’ll use the Gale-Shapley algorithm).

sage: m.solve()
{'C': ['b'], 'a': ['B'], 'b': ['C'], 'A': ['c'], 'c': ['A'], 'B': ['a']}

We see that a matching has been obtained. You can see the corresponding matching here:

Another nice method that we implemented is to use the awesome graph theory stuff that’s in Sage so you can obtain the corresponding bi-partite graph:

sage: p = m.bi_partite()
Bipartite graph on 6 vertices

You can see the corresponding plot here:

All of the above has not been reviewed yet so if you do have any comments they’d be very gratefully received. If you actually went over to trac and took a look at it there that would be great but otherwise just commenting here would be awesome.

This is the second in a series of 3 posts that I’ll get around to writing, in the next one I’ll cover ticket 16333: Normal Form Game. This is the biggest contribution by James as it involved interfacing with two other packages and also coding up a bespoke support enumeration algorithm.

August 27, 2014 12:00 AM

August 23, 2014

Nikhil Peter

GSoC: An End, And A New Beginning

Well, it’s officially done. As per my proposal, the project has been officially completed. It’s been a rollercoaster ride of new experiences, a ton of code(by my count its somewhere around 20k lines or so, but GitHub shows a much larger number) and some unforgettable memories. The app is nowhere near perfect, however, and I […]

by hav3n at August 23, 2014 09:15 AM

August 22, 2014

Simon Spicer

GSoC: Wrapping Up

Today marks the last day of my Google Summer of Code 2014 project. Evaluations are due midday Friday PDT, and code submissions for successfully passed projects start soon thereafter. The end result of my project can be found at Sage Trac Ticket 16773. In total it's just over 2000 lines of Python and Cython code, to be added to the next major Sage release (6.4) if/when it gets a final thumbs-up review.

When I write just the number of lines of code it doesn't sound like very much at all - I'm sure there are GSoC projects this year that produced at least 10k lines of code! However, given that what I wrote is complex mathematical code that needed a) significant optimization, and b) to be be mathematically sound in the first place, I'd say that isn't too bad. Especially since the code does what the project description aimed for it to do: compute analytic ranks of rational elliptic curves very quickly with a very low rate of failure. Hopefully this functionality can and will be used to produce numerical data for number theory research in the years to come.

The Google Summer of Code has been a thoroughly rewarding experience for me. It's a great way to sharpen one's coding skills and get paid in the process. I recommend it for any university student who's looking to go into any industry that requires programming skills; I'd apply to do it again next year if I wasn't planning to graduate then!

by Simon Spicer ( at August 22, 2014 10:04 AM

August 19, 2014

Amit Jamadagni


Hello everyone,
This week we have been working on editing the code to reach the standards along side running the tests. A decent amount of time has been spent on documentation. Plot methods and 3d co ordinates seem to be taking a longer time and even Miguel is giving this a good thought, so as of now these remain still in the thinking phase. In the meantime Miguel has shared some great work regarding the implementation of HOMFLY polynomial, stronger invariant to distinguish links. I am enjoying myself going through it as it shows how it is related to few other things which are of interest to me. Here is the link for it

So as a part of it I have implemented the dowker notation which is nothing but the (Ux, Ox) that is under-cross, over-cross at a particular crossing. This was a straight forward implementation as we had everything already present the PD code and the orientation of the crossings. In addition to this we have been working on the code to make it better and more cleaner. We have dropped the support to take in key words, now it is just that we directly give in the input for the link (the user need not mention whether it is a braid, PD_Code, or oriented gauss code) we have made it possible to detect from the way the user inputs what kind of input it is. There are few minor issues with code refactoring which we have been working on. Here is the link for the latest code :

by esornep at August 19, 2014 08:31 PM

August 18, 2014

Harald Schilly

New combinatorial designs in Sage - by Nathann Cohen

This is a guest post by Nathann Cohen. 

New combinatorial designs in Sage

Below, these graphs are a decomposition of a $K_{13}$ (i.e. the complete graph on 13 points) into copies of $K_4$. Pick two vertices you like: they appear exactly once together in one of the $K_4$.

The second graph shows a decomposition of a $K_{4,4,4}$ (i.e. the complete multipartite graph on $4\times 3$ points) into copies of $K_3$. Pick two vertices you like from different groups: they appear exactly once together in one of the $K_4$.

Sage has gotten quite good at building such decompositions (a specific kind of combinatorial designs) when they exist. This post is about them.

The first object belongs to a family called Balanced Incomplete Block Designs (or $(n,k)$-BIBD), which are defined as "a collection $\mathcal S$ of sets, all of them with size $k$ (here $k=4$), such that any pair of points of a set $X$ with $|X|=n$ (here $n=13$) appears in exactly one set of $\mathcal S$".

The second belongs to the family of Transversal Designs (or $TD(k,n)$) which have a similar definition: consider a set $X$ containing $k$ groups (here $k=3$) of $n$ vertices (here $n=4$). A collection $\mathcal S$ of sets, each of which contains one point from each group, is a $TD(k,n)$ if any two points from different groups appear together in exactly one set of $\mathcal S$.

The main problem with combinatorial designs is to know where they exist. And that is not obvious. Sage does what it can on about that:
  • If you want it to build a $(14,4)$-BIBD, it will tell you that none exists.
  • If you want it to build a $(16,4)$-BIBD, it will tell you that one exists.
  • If you want it to build a $(51,6)$-BIBD, it will tell you that it just not know if there is one (and nobody knows better at the moment)
Examples here:

sage: designs.balanced_incomplete_block_design(14,4,existence=True)
sage: designs.balanced_incomplete_block_design(16,4,existence=True)
sage: designs.balanced_incomplete_block_design(51,6,existence=True)

For a developer (and design lover), the game consists in teaching Sage how to build all combinatorial designs that appear in some research paper. For BIBD as well as for Transversal Designs, on which a LOT of sweat was spent these last months.

For Transversal designs the game is a bit different, as we know that a $TD(k-1,n)$ exists whenever a $TD(k,n)$ exists. Thus, the game consists in finding the largest integer $k_n$ such that a $TD(k_n,n)$ exists. This game is hardly new, and hardly straightforward: In the Handbook of Combinatorial Designs, one can find the table of such $k$ up to $n=10000$ (see here).

The good thing about Sage is that it does not just claim that such a design exists: it also builds it, and there is no better existence proof than that (it is very very quick to check that a combinatorial design is valid). The other good thing is that there is no common database for such data (the Handbook is not updated/printed every night), and that by teaching Sage all new designs found by researchers we build such a database. And it already contains designs that were not known when the Handbook was printed.

Finally, the other other good thing about Sage is that it will soon be able to tell you where those designs come from. Indeed, the most powerful results in the field of Transversal Designs are of the shape "If there exists a $TD(k_1,n_1)$, and a $TD(k_2,n_2)$, ..., and a $TD(k_c,n_c)$, then you can combine them all to obtain a $TD(k,n)$ with $k=k(k_1,...,k_c)$ and $n=n(n_1,...,n_c)$". And it is never very clear how to inverse these functions: if you want to build a $TD(k,n)$, which integers $k_1,...,k_c,n_1,...,n_c$ should you pick ?

Sage knows. It must know it, in order to build these designs anyway. And you can find that data inside. And soon, we will teach it to give you the bibliographical references of the papers in which you can find the right construction to produce the $TD(k,n)$ that you want. And we will provide the right parameters. And the world will be at peace.

A couple of things before we part:
  • Transversal Designs (TD), Orthogonal Arrays (OA), and Mutually Orthogonal Latin Squares (MOLS) are all equivalent objects.
  • We write a LOT of Transversal Designs code these days, so expect all this to improve very fast.
  • You can learn what Sage knows of combinatorial designs right here.
Finally, there are far too many combinatorial designs for one man to learn. If you love combinatorial designs, come join us: Vincent Delecroix, you, and I have code to write together. And if you know a related mathematical results that Sage ignores, come tell us: we could not have gone so far without the mathematical knowledge of Julian Abel. And Sage does not know everything yet.

Have fuuuuuuuuuuuuuuuuuuun !


by Harald Schilly ( at August 18, 2014 09:32 PM

August 15, 2014

Simon Spicer

How big should Delta be?

Let's take a look at the central formula in my GSoC rank estimation code. Let $E$ be a rational elliptic curve with analytic rank $r$. Then
$$ r < \sum_{\gamma} \text{sinc}^2(\Delta\gamma) =  \frac{1}{\pi\Delta} \left[ C_0 + \frac{1}{2\pi\Delta}\left(\frac{\pi^2}{6}-\text{Li}_2\left(e^{-2\pi\Delta}\right) \right) + \sum_{n=1}^{e^{2\pi\Delta}} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right) \right] $$

  • $\gamma$ ranges over the nontrivial zeros of the $L$-function attached to $E$
  • $\Delta$ is a positive parameter
  • $C_0 = -\eta + \log\left(\frac{\sqrt{N}}{2\pi}\right)$; $\eta$ is the Euler-Mascheroni constant $=0.5772\ldots$ and $N$ is the conductor of $E$
  • $\text{Li}_2(x)$ is the dilogarithm function, defined by $\text{Li}_2(x) = \sum_{k=1}^{\infty} \frac{x^k}{k^2}$
  • $c_n$ is the $n$th coefficient of the logarithmic derivative of the $L$-function of $E$.
The thing I want to look at in this post is that parameter $\Delta$. The larger you make it, the closer the sum on the left hand side over the zeros is to the analytic rank, so when trying to determine the rank of $E$ we want to pick as large a $\Delta$ as we can. However, the bigger this parameter is the more terms we have to compute in the sum over the $c_n$ on the right hand side; moreover the number of terms - and thus the total computation time - scales exponentially with $\Delta$. This severely constrains how big we can make $\Delta$; generally a value of $\Delta=2$ may take a second or two for a single curve on SageMathCloud, while $\Delta=3$ may take upwards of an hour. For the average rank project I'm working on I ran the code on 12 million curves using $\Delta=1.3$; the total computation time was about 4 days on SMC.

However, it should be clear that using too large a $\Delta$ is overkill: if you run the code on a curve with $\Delta=1$ and get a bound of zero out, you know that curve's rank exactly zero (since it's at most zero, and rank is a non-negative integer). Thus using larger $\Delta$ values on that curve will do nothing except provide you the same bound while taking much longer to do so.

This begs the question: just how big of a $\Delta$ value is good enough? Can we, given some data defining an elliptic curve, decide a priori what size $\Delta$ to use so that a) the computation returns a bound that is likely to be the true of the curve, and b) it will do so in as little time as possible?

The relevant invariant to look at here is conductor $N$ of the elliptic curve; go back to the formula above and you'll see that the zero sum includes a term which is $O\left(\frac{\log(N)}{2\pi\Delta}\right)$ (coming from the $\frac{1}{\pi \Delta} C_0$ term). This means that size of the returned estimate will scale with $\log(N)$: for a given $\Delta$, the bound returned on a curve with 10-digit conductor will be about double that which is returned for a 5-digit conductor curve, for example. However, we can compensate this by increasing $\Delta$ accordingly.

To put it all more concretely we can pose the following questions:
  • Given an elliptic curve $E$ with conductor $N$, how large does $\Delta$ need to be in terms of $N$ so that the returned bound is guaranteed to be the true analytic rank of the curve?
  • Given a conductor size $N$ and a proportionality constant $P \in [0,1]$, how big does $\Delta$ have to be in terms of $N$ and $P$ so that at least $P\cdot 100$ percent of all elliptic curves with conductor of size about $N$ will, when the rank estimation code is run on them with that $\Delta$ value, yield returned bounds that are equal to their true analytic rank?
[Note: in both of the above questions we are making the implicit assumption that the returned rank bound is monotonically decreasing for increasing $\Delta$. This is not necessarily the case: the function $y = \text{sinc}^2(x)$ is not a decreasing function in $x$. However, in practice any upwards fluctuation we see in the zero sum is small enough to be eliminated when we take the floor function to get an integer rank bound.]


The first question is easier to phrase, but more difficult to answer, so we will defer it for now. To answer the second question, it is useful mention what we know about the distribution and density of nontrivial zeros of an elliptic curve $L$-function.

Using some complex analysis we can show that, for the $L$-function of an elliptic curve with conductor $N$, the expected number of zeros in the critical strip with imaginary part at most $T$, is $O(T\log N+ T\log T)$. That is, expected zero density has two distinct components: a part that scales linearly with log of the conductor, and a part that doesn't scale with conductor (but does scale slightly faster than linearly with how far out you go).

Consider the following: if we let 
$$\Delta(N) = \frac{C_0}{\pi} = \frac{1}{\pi}\left(-\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)\right)$$
then the first term in the right hand side of the zero sum formula is precisely 1 - this is the term that comes from the $\log N$ part of the zero density. The next term - the one involving $\frac{\pi^2}{6}-\text{Li}_2\left(e^{-2\pi\Delta}\right)$ - is the term that comes from the part independent of $N$; because the right hand side is divided by $\Delta$ it therefore goes to zero as the curve's conductor increases. The third term contains the $c_n$ coefficients which (per Sato-Tate) will be positive half the time and negative half the time, so the entire sum could be positive or negative; we therefore expect its contribution to be small on average when we look at large number of elliptic curves.

It thus stands to reason that for this value of Delta, and when the conductor $N$ is sufficiently large, the zero sum will be about 1, plus or minus a smallish amount coming from the $c_n$ sum. This argument is by no means rigorous, but we might therefore expect the zero sum to be within 2 of the actual analytic rank most of the time. Couple that with knowledge of the root number and you get a rank upper bound out which is equal to the actual analytic rank in all but a few pathological cases.

Empirical evidence bears this out. I ran my rank estimation code with this choice of $\Delta$ scaling on the whole Cremona database, which contains all elliptic curves up to conductor 350000:
The proportion of curves up to conductor $N$ for which the computed rank bound is strictly greater than rank. The $x$-axis denotes conductor; the $y$-axis is the proportion of all curves up to that conductor for which the returned rank bound was not equal to the true rank (assuming BSD and GRH as always).
As you can see, the percentage of curves for which the rank bound is strictly greater than rank holds fairly constant at about 0.25%. That's minuscule: what this is saying is that if you type in a random Weierstrass equation, there is only about a 1 in 1000 chance that the rank bounding code with $\Delta = \frac{C_0}{\pi}$ will return a value that isn't the actual analytic rank. Moreover, the code runs considerably faster than traditional analytic rank techniques, so if you wanted to, for example, compute the ranks of a database of millions of elliptic curves, this would be a good first port of call.

Of course, this $\Delta$ scaling approach is by no means problem-free. Some more detailed analysis will show that that as stated above, the runtime of the code will actually be $O(N)$ (omitting log factors), i.e. asymptotic scaling is actually worse than traditional analytic rank methods, which rely on evaluating the $L$-function directly and thus are $O(\sqrt{N})$. It's just that with this code we have some very small constants sitting in front, so the crossover point is at large enough conductor values that neither method is feasible anyway. 

This choice of $\Delta$ scaling works for conductor ranges up to about $10^9$; that corresponds to $\Delta \approx 2.5$, which will give you a total runtime of about 10-20 seconds for a single curve on SMC. Increase the conductor by a factor of 10 and your runtime will also go up tenfold.

For curves of larger conductor, instead of setting $\Delta = \frac{C_0}{\pi}$ we can choose to set $\Delta$ to be $\alpha\cdot \frac{C_0}{\pi}$ for any $\alpha \in [0,1]$; the resulting asymptotic runtime will then be $O(N^{\alpha})$, at the expense of having a reduced proportion of elliptic curves where rank bound is equal to true rank.


When we use $\Delta = \frac{C_0}{\pi}$, the curves for which the returned rank estimate is strictly larger than the true rank are precisely those which have unusually low-lying zeros. For example, the rank 0 curve with Cremona label 256944c1, has a zero with imaginary part at 0.0256 (see here for a plot), compared to an expected value of 0.824. Using $\Delta = \frac{C_0}{\pi}$ on this curve means $\Delta \approx 1.214$; if we compute the corresponding zero sum with this value of $\Delta$ we get a value of 2.07803. The smallest value of $\Delta$ for which we get a zero sum value of less than 2 is empirically about 2.813; at this point taking the floor and invoking parity tells us that the curve's rank is zero.

The above example demonstrates that if we want to guarantee that the returned rank bound is the true analytic rank, we are forced to increase the size of $\Delta$ to something larger than $\frac{C_0}{\pi}$. Do we need to increase $\Delta$ by a fixed value independent of $N$? Do we need to increase it by some constant factor? Or does it need to scale faster than $\log N$? These are hard questions to answer; it comes down to determining how close to the central point the lowest nontrivial zero can be as a function of the conductor $N$ (or some other invariants of $E$), which in turn is intimately related to estimating the size of the leading coefficient of the $L$-series of $E$ at the central point. This is already the topic of a previous post: it is a question that I hope to make progress in answering in my PhD dissertation.

by Simon Spicer ( at August 15, 2014 10:44 AM

August 14, 2014

Simon Spicer

Things are Better in Parallel

A recent improvement I've implemented in my GSoC code is to allow for parallelized computation. The world has rapidly moved to multi-core as a default, so it makes sense to write code that can exploit this. And it turns out that the zero sum rank estimation method that I've been working on can be parallelized in a very natural way.


But first: how does one compute in parallel in Sage? Suppose I have written a function in a Sage environment (e.g. a SageMathCloud worksheet, .sage file, Sage console etc.) which takes in some input and returns some other input. The simple example f below takes in a number n and returns the square of that number.

sage: def f(n):
....:     return n*n
sage: f(2),f(3),f(5)
(4, 9, 25)

This is a fairly straightforward beast; put in a value, get a value out. But what if we have some computation that requires evaluating that function on a large number of inputs? For example, say we want to compute the sum of the first 10 million squares. If you only have one processor to tap, then you're limited to calling f over and over again in series:

sage: def g(N):
....:     y = 0
....:     for n in range(N+1):
....:         y += f(n)
....:     return y
sage: %time g(10000000)
CPU times: user 17.5 s, sys: 214 ms, total: 17.7 s
Wall time: 17.6 s

In this example you could of course invoke the formula for the sum of the first $n$ squares and just write down the answer without having to add anything up, but in general you won't be so lucky. You can optimize the heck out of f, but in general when you're limited to a single processor you're confined to iterating over all the values you need to consider sequentially .

However, if you have 2 processors available one could try write code that splits the work into two roughly equal parts that can run relatively independently. For example, for our function we could compute the sum of all the even squares up to a given bound in one process and the sum of all the odd squares in another, and then add the two results together to get the sum of all square up to our bound.

Sage has a readily available mechanism to do exactly this: the @parallel decorator. To enable parallel computation in your function, put @parallel above your function definition (the decorator can take some parameters; below ncpus=2 tells it that we want to use 2 processors). Note that we also have to modify the function: now it no longer only takes the bound up to which we must add squares, but also a flag indicating whether we should consider even or odd squares.

sage: @parallel(ncpus=2)
....: def h(N,parity):
....:     y = 0
....:     if parity=="even":
....:         nums = range(0,N+1,2)
....:     elif parity=="odd":
....:         nums = range(1,N+1,2)
....:     for n in nums:
....:         y += f(n)
....:     return y

Instead of calling h with its standard sequence of parameters, we can pass it a list of tuples, where each tuple is a valid sequence of inputs. Sage then sends each tuple of inputs off to an available processor and evaluates the function on them there. What's returned is a generator object that can iterate over all the outputs; we can always see the output directly by calling list() on this returned generator:

sage: for tup in list(h([(1000,"even"),(1000,"odd")])):
....:     print(tup)
(((1000, 'even'), {}), 167167000)
(((1000, 'odd'), {}), 166666500)

Note that the tuple of inputs is also returned. Because we're doing things in parallel, we need to know which output corresponds to which input, especially since processes may finish at different times and return order is not necessarily the same as the order of the input list.

Finally, we can write a wrapper function which calls our parallelized function and combines the returned results:

sage: def i(N):
....:     y = 0
....:     for output in h([(N,"even"),(N,"odd")]):
....:         y += output[1]
....:     return y
sage: %time i(10000000)
CPU times: user 1.76 ms, sys: 33.2 ms, total: 35 ms
Wall time: 9.26 s

Note that i(10000000) produces the same output at g(10000000) but in about half the time, as the work is split between two processes instead of one. This is the basic gist of parallel computation: write code that can be partitioned into parts that can operate (relatively) independently; run those parts on different processors simultaneously; and then collect returned outputs and combine to produce desired result.


Let's take a look at the rank estimating zero formula again. Let $E$ be a rational elliptic curve with analytic rank $r$. Then

r < \sum_{\gamma} \text{sinc}^2(\Delta\gamma) &=  \frac{1}{\pi \Delta}\left(-\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)\right) \\
&+ \frac{1}{2\pi^2\Delta^2}\left(\frac{\pi^2}{6}-\text{Li}_2\left(e^{-2\pi\Delta}\right) \right) \\
&+ \frac{1}{\pi \Delta}\sum_{\substack{n = p^k \\ n < e^{2\pi\Delta}}} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right)
  • $\gamma$ ranges over the nontrivial zeros of the $L$-function attached to $E$
  • $\Delta$ is a positive parameter
  • $\eta$ is the Euler-Mascheroni constant $=0.5772\ldots$
  • $N$ is the conductor of $E$
  • $\text{Li}_2(x)$ is the dilogarithm function, defined by $\text{Li}_2(x) = \sum_{k=1}^{\infty} \frac{x^k}{k^2}$
  • $c_n$ is the $n$th coefficient of the logarithmic derivative of the $L$-function of $E$, which is zero when $n$ is not a perfect prime power.
The right hand side of the sum, which is what we actually compute, can be broken up into three parts: the first term involving the curve's conductor $N$; the second term involving the dilogarithm function $Li_2(x)$; and the sum over prime powers. The first two parts are quick to compute: evaluating them can basically be done in constant time regardless of the magnitude of $N$ or $\Delta$.

It is therefore not worth considering parallelizing these two components, since the prime power sum dominates computation time for all but the smallest $\Delta$ values. Instead, what I've done is rewritten the zero sum code so that the prime power sum can be evaluated using multiple cores.

As mentioned in this post, we can turn this sum into one indexed by the primes (and not the prime powers); this actually makes parallelization quite straightforward. Recall that all primes except $2$ are odd, and all primes except $2$ and $3$ are either $1$ or $5$ remainder $6$. One can scale this up: given a list of small primes $[p_1,p_2,\ldots,p_n]$, all other primes fall into one of a relatively small number of residue classes modulo $p_1 p_2\cdots p_n$. For example, all primes beyond $2$, $3$, $5$ and $7$ have one of the following 48 remainders when you divide them by $210 = 2\cdot 3\cdot 5 \cdot 7$:
&1, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79,\\
&83, 89, 97, 101, 103, 107, 109, 113, 121, 127, 131, 137, 139, 143, 149, \\
&151, 157, 163, 167, 169, 173, 179, 181, 187, 191, 193, 197, 199, 209,
and no other.

If we had 48 processors available. the natural thing to do would be to get each of them to iterate over all integers in a particular residue class up to $e^{2\pi\Delta}$, evaluating the summand whenever that integer is prime, and returning the sum thereof. For example, if the bound was 1 million, then processor 1 would compute and return $\sum_{n \equiv 1 (\text{mod } 210)}^{1000000} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right)$. Processor 2 would do the same with all integers that are $11$ modulo $210$, etcetera.

In reality, we have to figure out a) how many processors are available, and b) partition the work relatively equally among those processors. Thankfully sage.parallel.ncpus.ncpus() succinctly addresses the former, and the latter is achieved by splitting the residue classes into $n$ chunks of approximately equal size (where $n$ is the number of available CPUs) and then getting a given processor to evaluate the sum over those residues in a single chunk.

Here is the method I wrote that computes the $\text{sinc}^2$ zero sum with (the option of) multiple processors:

Note that I've defined a subfunction to compute the prime sum over a given subset of residue classes; this is the part that is parallelized. Obtaining the residue chunks and computing the actual summand at each prime are both farmed out to external methods.

Let's see some timings. The machine I'm running Sage on has 12 available processors:

sage: sage.parallel.ncpus.ncpus()
sage: E = EllipticCurve([12838,-51298])
sage: Z = LFunctionZeroSum(E)
sage: %time Z._zerosum_sincsquared_fast(Delta=2.6)
CPU times: user 36.7 s, sys: 62.9 ms, total: 36.8 s
Wall time: 36.8 s
sage: %time Z._zerosum_sincsquared_parallel(Delta=2.6)
CPU times: user 7.87 ms, sys: 133 ms, total: 141 ms
Wall time: 4.06 s

Same answer in a ninth of the time! Note that the parallelized method has some extra overhead, so even with 12 processors we're unlikely to get a full factor of 12 speedup. Nevertheless, the speed boost will allow us to run the zero sum code with larger $\Delta$ values, allowing us to investigate elliptic curves with even larger conductors.

by Simon Spicer ( at August 14, 2014 01:26 PM

August 12, 2014

Amit Jamadagni


Hello everyone,
There has been a delay this time. This week I have worked on cleaning the code and I am trying to get to the standard required as well trying to clear the bugs which come along. I have been studying the theory on plotting (not really the theory but the components required for implementation). The inspiration mainly comes from Graph theory and network flows. I have been trying to understand what is going in the Spherogram package where they try to generate the data required for the plot and use plink to get the diagram using this data. The focus has been on to achieve the data they have got and use it via the sage methods rather than directing it to plink. The code is mainly present in the file and they have used orthogonal representations to generate the plot. The one thing which is basic and still not clear is what are they considering as vertices. While I was working on this, I and Miguel had a meeting on Monday and we had some issues to be resolved before we moved forward. I have resolved the issues which were arising in the seifert_to_braid method and have added the __repr__ method. The issue with seifert_to_braid was with the ordering of the seifert_circles. Previously the idea was to find the intersection of the seifert circles with others (as initially we had used the idea of consecutive numbers for naming the edges) by adding a one and subtracting and checking on the intersection. But I over looked this part when I worked on the extension to links and it stuck me here, that I had to even edit this. Now I have edited this part and now we check for the intersection of seifert circles with the pd code, remove all the crossings which share the seifert circle numbering and at the same time select one of the crossings which share the seifert circle and construct a variable which had only parts other than the seifert circle. Now we use this to find the intersection with the other seifert circles and so on and so forth, in this way we order the seifert circles. The __repr__ method has been straight forward. I have removed the method smallest equivalent and replaced the name of the link_number with ncomponents.

Moving on I worked with Jones polynomial, I had this doubt whether the smoothing would depend on the orientation of the crossing. The answer was that it does not and that made my things easy as I had to just refine the earlier code which took into consideration the orientation of the crossing. So now I can say atleast the Kaufmann’s polynomial works fine, to get to the Jones polynomial, I would have to replace the polynomial variable by t^(1/4) for which I have been searching around with no answers. I would be working on the plot methods this week and try to see if I can get something out.

This is the last week before the pencils down, however I will try to continue the work and blog accordingly. I have learnt a lot during these two months, it has been a fascinating journey and I would be continuing my work post GSoC on making things better. I hope you have enjoyed reading the posts (sometimes I have not moved into the details, because I wanted to keep it simple). I will be continuing to blog my posts and hopefully the work till now can get me across the final evaluation.

I would like to leave you with some examples of Jones polynomial (without the t^(1/4) substitution) and also the work till now.

sage: L = link.Link(B([-1, 2, -1, 2, -3, -2, 1, -2, -3]))
sage: L.alexander_polynomial()
-2*t^4 + 5*t^3 - 2*t^2
sage: L.jones_polynomial()
t^24 - t^20 + t^16 - 2*t^12 + t^8 - t^4 + 2
sage: L = link.Link(B([-1, -1, -2, 1, 3, -2, 3]))
sage: L.alexander_polynomial()
-2*t^3 + 5*t^2 - 2*t
sage: L.jones_polynomial()
t^16 - t^12 + t^8 - 2*t^4 + 2 - t^-4 + t^-8

The white-head link:

sage: l5 = [[1,8,2,7],[8,4,9,5],[3,9,4,10],[10,1,7,6],[5,3,6,2]]
sage: L = link.Link(PD_code = l5)
sage: L.jones_polynomial()
t^14 - 2*t^10 + t^6 - 2*t^2 + t^-2 - t^-6

The Right Trefoil

sage: L = link.Link(B([1, 1, 1]))
sage: L.jones_polynomial()
t^-4 + t^-12 - t^-16

The Left Trefoil

sage: L = link.Link(B([-1, -1, -1]))
sage: L.jones_polynomial()
-t^16 + t^12 + t^4

The Hopf Link

sage: l1 = [[1,4,2,3],[4,1,3,2]]
sage: L = link.Link(PD_code = l1)
sage: L.jones_polynomial()
-t^10 - t^2

And here is the link for the latest code :

Thanks for reading through.

PS: I mentioned in the previous blog that 3d input was the priority, but as 3d inputs and plot were more related, I choose to work on the latter. 3d inputs require projecting the lines into 2d but I have not found any methods around which would make things easier. Plot looks to be more achievable. Yet I will work on the 3d inputs but for now the plot seems more interesting.

by esornep at August 12, 2014 09:17 PM

August 04, 2014

Amit Jamadagni


Hello everyone,
This week we mainly focused on jones polynomial. The orientation method had few edits and we are expecting it to be fine. I started out with the implementation of jones polynomial, the previous version where the trip matrix was used was restricted to knots. That method mainly used the oriented gauss code but in this implementation we have used the PD code so that it would work for links as well. I have tried to comment in the code the method I have followed, the outline of the procedure has been taken from

There are few things which remain to be implemented, mainly like accepting the 3d coordinates and HOMFLY polynomial. Few methods have to be renamed and a I guess there is work remaining in the documentation part of the code. This week the focus would be on accepting 3d coordinates and converting it to PD code that would allow the conversion to braid or oriented gauss code. So the target for the week would be
1. 3d coordinates (the priority)
2. Renaming the methods
3.  Storing (this has been pending for a good amount of time, the idea is once something related is computed we store it, this is being achieved by creating an object which is not an efficient way of doing things).

Hopefully I can get the above things working. That’s it from me and thanks for reading through.

Here is the pull request for the week,

by esornep at August 04, 2014 08:33 PM

August 01, 2014

Vince Knight

A Sneak Preview of Game Theory in Sage (1/3): Cooperative Game Theory

+James Campbell and I spent a lot of time this Summer working on implementing some Game Theory in to Sage.

In this post I’ll (very briefly) describe some of the process involved in contributing to Sage and give a sneak peek at some of the Game Theory code that will (hopefully) be in Sage soon.

This has been a really great experience as it was my first time really contributing to an open source project.

It involved a lot of coding, documentation writing and also being very thankful for all the help we got from the Sage community (a huge thanks to Karl).

It all starts with opening ‘tickets’ on the trac server. We opened 3 tickets:

  • 16331: Build capacity to solve matching games in to Sage.
  • 16332: Build capacity to calculate Shapley value of cooperative games.
  • 16333: Build class for normal form games as well as ability to obtain Nash equilibria

After doing that James and I quickly realised that we needed to have our own common repository for Game Theory development so that’s up on github here.

In this post I’ll talk briefly about some of the stuff that you will be able to do thanks to ticket 16332. This particular ticket has in fact been reviewed and given the all clear so should hopefully make its way in to a future release of Sage! (Which is very exciting indeed!).

If you would like to follow along with some of the stuff written here you’ll need to grab the 16332 branch from the github repository: or go directly to the trac server and grab the 16332 ticket.

What is a cooperative game?

Here is the definition that I give my students:

A characteristic function game G is given by a pair \(N,v\) where \(N\) is the number of players and \(v:2[N]\to\mathbb{R}\) is a characteristic function which maps every coalition of players to a payoff.

Here is something else that I describe to my students:

Let’s assume that Alice, Bob and Celine all share a taxi. They all live in a straight line (with regards to the trajectory of the taxi) and the costs associated with their trip is Alice: £5, Bob: £20, Celine: £39.

What is the fairest way of sharing out the total taxi fair (which would be £39)?

From Alice’s point of view she needs to pay less than £5 (or their would be no point in her sharing the taxi). Similarly for Bob and Celine, however we also want the amount paid by Alice and Bob to be less than if they had shared a taxi etc…

To solve our problem we can use cooperative game theory and in particular use a characteristic function game:

\[ v(c) = \begin{cases} 0 &\text{if } c = \emptyset, \\ 5 &\text{if } c = \{A\}, \\ 20 &\text{if } c = \{B\}, \\ 39 &\text{if } c = \{C\}, \\ 20 &\text{if } c = \{A,B\}, \\ 39 &\text{if } c = \{A,C\}, \\ 39 &\text{if } c = \{B,C\}, \\ 39 &\text{if } c = \{A,B,C\}. \\ \end{cases} \]

This function maps each coalition of players to a value (in particular to their taxi fair). So we see that if Alice and Bob shared a taxi without Celine then their taxi fair would be £20.

It can be shown (I won’t cover that here as I want to get to the Sage code) that the ‘fair’ way of sharing the cost of the taxi is called the Shapley value \(\phi\) which is a vector given by:

\[ \phi_i(G) = \frac{1}{N!} \sum_{\pi\in\Pi_n} \Delta_{\pi}^G(i) \]


\[ \Delta_{\pi}^G(i) = v\bigl( S_{\pi}(i) \cup {i} \bigr) - v\bigl( S_{\pi}(i) \bigr) \]

where \(S_{\pi}(i)\) is the set of predecessors of \(i\) in some permutation of the players \(\pi\), i.e. \(S_{\pi}(i) = \{ j \mid \pi(i) > \pi(j) \}\).

I’ve got a video that describes all this if it’s helpful: Here however I want to give a sneak preview of how to figure out what Alice, Bob and Celine should pay using a future release of Sage:

First of all we need to define the characteristic function:

sage: v = {(): 0,
....:      ('A'): 5,
....:      ('B'): 20,
....:      ('C'): 39,
....:      ('A', 'B'): 20,
....:      ('A', 'C'): 39,
....:      ('B', 'C'): 39,
....:      ('A', 'B', 'C'): 39}

As you can see we do this using a Python dictionary which allows us to map tuples (or indeed coalitions) to values (which is exactly what a characteristic function is).

We then create an instance of the CooperativeGame class (which is what James and I put together):

sage: taxi_game = CooperativeGame(v)

If you tab complete after typing taxi_game. you can see some of the methods and attributes associated with the CooperativeGame class:

sage: taxi_game.
taxi_game.category          taxi_game.dump              taxi_game.is_monotone       taxi_game.nullplayer        taxi_game.rename            taxi_game.shapley_value
taxi_game.ch_f              taxi_game.dumps             taxi_game.is_superadditive  taxi_game.number_players    taxi_game.reset_name        taxi_game.version
taxi_game.db                taxi_game.is_efficient      taxi_game.is_symmetric      taxi_game.player_list

I won’t go in to much of the details of that year but you can get some help on anyone of those by typing ? after one of them (below you can see some of the output):

sage: taxi_game.is_symmetric?
Type:       instancemethod
String Form:<bound method CooperativeGame.is_symmetric of A 3 player co-operative game>
File:       /Users/vince/sage/local/lib/python2.7/site-packages/sage/game_theory/
Definition: taxi_game.is_symmetric(self, payoff_vector)
   Return "True" if "payoff_vector" possesses the symmetry property.

   A payoff vector possesses the symmetry property if v(C cup i) =
   v(C cup j) for all C in 2^{Omega} setminus {i,j}, then x_i =


   * "payoff_vector" -- a dictionary where the key is the player and
     the value is their payoff

But what we really want to know is how much should Alice, Bob and Celine pay the taxi?

To calculate this we simply get Sage to tell us the Shapley value:

sage: taxi_game.shapley_value()
{'A': 5/3, 'B': 55/6, 'C': 169/6}

This shows that in this particular case Alice should pay £1.67, Bob £9.17 and Celine £28.17 (rounding has obviously caused us to gain a penny along the way but you get the idea) :)

This is the first in a series of 3 posts that I’ll get around to writing, in the next one I’ll cover ticket 16331 which takes care of matching games :)

To go back to the process of contributing to an open source project, I really think this is something everyone with any interests in code should have a go at doing as it has a large number of benefits. None more so than improving the standard of code that one writes. When you’re writing because you hope that someone will look at and review your code you make sure it’s well written (or at least try to!).

August 01, 2014 12:00 AM

July 31, 2014

William Stein

SageMathCloud -- history and status

2005: I made first release the SageMath software project, with the goal to create a viable open source free alternative to Mathematica, Magma, Maple, Matlab.

2006: First web-based notebook interface for using Sage, called "sagenb". It was a cutting edge "AJAX" application at the time, though aimed at a small number of users.

2007-2009: Much work on sagenb. But it's still not scalable. Doesn't matter since we don't have that many users.

2011-: Sage becomes "self sustaining" from my point of view -- I have more time to work on other things, since the community has really stepped up.

2012: I'm inspired by the Simons Foundation's (and especially Jim Simon's) "cluelessness" about open source software to create a new online scalable web application to (1) make it easier for people to get access to Sage, especially on Windows, and (2) generate a more longterm sustainable revenue stream to support Sage development. (I was invited to a day-long meeting in NYC at Simon's headquarters.)

2012-2013: Spent much of 2012 and early 2013 researching options, building prototypes, some time talking with Craig Citro and Robert Bradshaw (both at Google), and launched SageMathCloud in April 2013. SMC got some high-profile use, e.g., by UCLA's 400+ student calculus course.

2014: Much development over the last 1.5 years. Usage has also grown. There is some growth information here. I also have useful google analytics data from the whole time, which shows around 4000 unique users per week, with an average session duration of 97 minutes (see attached). Number of users has actually dropped off during the summer, since there is much less teaching going on.

SMC itself is written mostly in CoffeeScript using Node.js on the backend. There's a small amount of Python as well.

It's a highly distributed multi-data center application. The database is Cassandra. The backend server processes are mostly Node.js processes, and also nginx+haproxy+stunnel.

A copy of user data is stored in every data center, and is snapshotted every few minutes, both via :
  • ZFS -- for rolling snapshots that vanish after a month -- and via
  • bup -- for snapshots that remain forever, and are consistent across data centers.
These snapshots are critical for making it possible to trust collaborators on projects to not (accidentally) destroy your work. It is not possible for users to delete the bup snapshots, by design.
Here's what it does: realtime collaborative editing of Latex docs, IPython notebooks, Sage worksheets; use the command line terminal; have several people collaborate on a project (=a Linux account).
The main applications seem to be:
  • teaching courses with a programming or math software components -- where you want all your students to be able to use something, e.g., IPython, Julia, etc, and don't want to have to deal with trying to get them to install said software themselves. Also, you want to easily be able to share files with students, see their work in realtime, etc. It's a much, much easier for people to get going that with naked VM's they have to configure -- and also I provide cross-data center replication.
  • collaborative research mathematics -- all co-authors of a paper work together in an SMC project, both writing the paper there and doing computations.
Active development work right now:
  • course management for homework (etc)
  • administration functionality (mainly motivated by self-hosting and better moderation)
  • easy history slider to see all pasts states of a document
  • switching from bootstrap2 to bootstrap3.

by William Stein ( at July 31, 2014 11:17 PM

Simon Spicer

The average rank of elliptic curves

It's about time I should demonstrate the utility of the code I've written - the aim of the game for my GSoC project, after all, is to provide a new suite of tools with which to conduct mathematical research.

First some background. Given an elliptic curve $E$ specified by equation $y^2 = x^3 + Ax + B$ for integers $A$ and $B$, one of the questions we can ask is: what is the average rank of $E$ as $A$ and $B$ are allowed to vary? Because there are an infinite number of choices of $A$ and $B$, we need to formulate this question a bit more carefully. To this end, let us define the height of $E$ to be
$$ h(E) = \max\{|A|^3,|B|^2\} $$
[Aside: The height essentially measures the size of the coefficients of $E$ and is thus a fairly decent measure of the arithmetic complexity of the curve. We need the 3rd and 2nd powers in order to make the height function scale appropriately with the curve's discriminant.]

We can then ask: what is the limiting value of the average rank of all curves up to height $X$, as $X$ gets bigger and bigger? Is it infinity? Is it 0? Is it some non-trivial positive value? Does the limit even exist? It's possible that the average rank, as a function of $X$ could oscillate about and never stabilize.

There are strong heuristic arguments suggesting that the answer should be exactly $\frac{1}{2}$. Specifically, as the height bound $X$ gets very large, half of all curves should have rank 0, half should have rank 1, and a negligible proportion should have rank 2 or greater.

Even as recently as five years ago this there we knew nothing concrete unconditionally about average curve rank. There are some results by BrumerHeath-Brown and Young providing successively better upper bounds on the average rank of curves ordered by height (2.3, 2 and $\frac{25}{14}$ respectively), but these results are contingent on the Riemann Hypothesis.

However, starting in 2011 Manjul Bhargava, together with Christopher Skinner and Arul Shankar, published a series of landmark papers (see here for a good expository slideshow, and here and here for two of the latest publications) proving unconditionally that average rank - that is, the limiting value of the average rank of all elliptic curves up to height $X$ - is bounded above by 0.885. A consequence of their work too is that at least 66% of all elliptic curves satisfy the rank part of the Birch and Swinnerton-Dyer Conjecture.

To a computationally-minded number theorist, the obvious question to ask is: Does the data support these results? I am by no means the first person to ask this question. Extensive databases of elliptic curves under various orderings have already been compiled, most notably those by Cremona (ordered by conductor) and Stein-Watkins (ordered essentially by discriminant). However, as yet no extensive tabulation of height-ordered elliptic curves has been carried out.

Here is a summarized table of elliptic curves with height at most 10000 - a total of 8638 curves, and the ranks thereof (all computations done in Sage, of course):

Rank# Curves%

Thus the average rank of elliptic curves is 0.816 when the height bound is 10000. This is worrisome: the average is significantly different from the value of 0.5 we're hoping to see.

The situation gets even worse when we go up to height bound 100000:

Rank# Curves%

This yields an average rank of 0.862 for height bound 100000. Bizarrely, the average rank is getting bigger, not smaller!

[Note: the fact that 0.862 is close to Bhargava's asymptotic bound of 0.885 is coincidental. Run the numbers for height 1 million, for example, and you get an average rank of 0.8854, which is bigger than the above asymptotic bound. Observationally, we see the average rank continue to increase as we push out to even larger height bounds beyond this.]

So what's the issue here? It turns out that a lot of the asymptotic statements we can make about elliptic curves are precisely that: asymptotic, and we don't yet have a good understanding of the associated rates of convergence. Elliptic curves, ornery beasts that they are, can seem quite different from their limiting behaviour when one only considers curves with small coefficients. We expect (hope?) that the average to eventually turn around and start to decrease back down to 0.5, but the exact point at which that happens is as yet unknown.

This is where I come in. One of the projects I've been working on (with Wei Ho, Jen Balakrishnan, Jamie Weigandt, Nathan Kaplan and William Stein) is to compute the average rank of elliptic curves for as large a height bound as possible, in the hope that we will get results a bit more reassuring than the above. The main steps of the project are thus:

  1. Generate an ordered-by-height database of all elliptic curves up to some very large  height bound (currently 100 million; about 18.5 million curves);
  2. Use every trick in the book to compute the ranks of said elliptic curves;
  3. Compute the average of said ranks.
Steps 1 and 3 are easy. Step 2 is not. Determining the rank of an elliptic curve is a notoriously hard problem - no unconditional algorithm with known complexity currently exists - especially when you want to do it for millions of curves in a reasonable amount of time. Sage, for example, already has a rank() method attached to their EllipticCurve class; if you pass the right parameters to it, the method will utilize an array of approaches to get a value out that is (assuming standard conjectures) the curve's rank. However, its runtime for curves of height near 100 million is on the order of 20 seconds; set it loose on 18.5 million such curves and you're looking at a total computation time of about 10 CPU years.

Enter GSoC project stage left. At the expense of assuming the Generalized Riemann Hypothesis and the Birch and Swinnerton-Dyer Conjecture, we can use the zero sum rank bounding code I've been working on to quickly compute concrete upper bounds on an elliptic curve's rank. This approach has a couple of major advantages to it:
  • It's fast. In the time it's taken me to write this post, I've computed rank bounds on 2.5 million curves.
  • Runtime is essentially constant for any curve in the database; we don't have to worry about how the method scales with height or conductor. If we want to go up to larger height bounds at a later date, no problem.
As always, some Terms and Conditions apply. The rank bounding code only gives you an upper bound on the rank: if, for example, you run the code on a curve and get the number 4 back, there's no way to determine with this method if the curve's rank is 4, or if it is really some non-negative integer less than 4. 

Note: there is an invariant attached to an elliptic curve called the root number which is efficiently computable, even for curves with large conductor (it took less than 20 minutes to compute the root number for all 18.5 million curves in our database). The root number is one of two values: -1 or +1; if it's -1 the curve has odd analytic rank, and if it's +1 the curve has even analytic rank. Assuming BSD we can therefore always easily tell if the curve's rank is even or odd. My GSoC rank estimation code takes the root number into account, so in the example above, a returned value of 4 tells us that the curve's true rank is one of three values: 0, 2 or 4.

Even better, if the returned value is 0 or 1, we know this must be the actual algebraic rank of the curve: firstly, there's no ambiguity as to what the analytic rank is - it has to the returned 0 or 1; secondly, the BSD conjecture has been proven in the rank 0 & 1 cases. Thus even though we are a priori only computing analytic rank upper bounds, for some proportion of curves we've found the actual algebraic rank.

[Note that the rank bounding code I've written is predicated on knowing that all nontrivial zeros of an elliptic curve $L$-function lie on the critical line, so we still have to assume the Generalized Riemann Hypothesis.]

Thus running the rank bound code on the entire database of curves is a very sensible first step, and it's what I'm currently doing. It's computationally cheap to do - on SageMathCloud, using a Delta value of 1.0, the runtime for a single curve is about 4 milliseconds. Moreover, for some non-negligible percentage of curves the bounds will be observably sharp - based on some trial runs I'm expecting about 20-30% of the computed bounds to be 0 or 1.

That's about 4 million curves for which we won't have to resort to much more expensive rank finding methods. Huge savings!

by Simon Spicer ( at July 31, 2014 08:37 AM

July 28, 2014

Amit Jamadagni


Hello everyone,
The last week we focused on extending the current functionality to links. That involved a lot of refactoring the code. The methods have become more general and work for links. Some of the issues like naming of the methods and storing once a form of representation is calculated are some of the few which remain to be worked upon. This has been a great learning curve for me. Many things were edited which includes addition of the method orientation which gives the signs of the crossings. That was really helpful in formulating the other structures. There was one other issue in the vogel move, this had a case where the pulling of the higher strand onto the lower one did not lead to a generalization on constructing the new crossings.  The issue was resolved as we looked upto the strands in the regions and decided how the crossings would be generated. If the strands were positive, one kind of crossing were obtained and if the strands were negative the other crossings were obtained. I had this thought but I was not sure whether it would work for all cases, Miguel chipped in and gave me the idea in a more concrete setting. So that set the method vogel move up. The rest was revamped and a lot of cleaning has been taking place in the code to remove the stuff which is not necessary. Hope we have most of the functionality by the end. That’s it from me. Thanks for reading through and here is the latest pull request.

by esornep at July 28, 2014 07:12 PM

July 27, 2014

Vince Knight

Game Theory and Pavement Etiquette

Last week the BBC published an article entitled: ‘Advice for foreigners on how Britons walk’. The piece basically discusses the fact that in Britain there doesn’t seem to be any etiquette over which side of the pavement one walks on:

The British have little sense of pavement etiquette, preferring a slalom approach to pedestrian progress. When two strangers approach each other, it often results in the performance of a little gavotte as they double-guess in which direction the other will turn.

Telling people how to walk is simply not British.

But on the street? No, we don’t walk on the left or the right. We are British and wander where we will.

I thought this was a really nice piece and more importantly it is very closely linked to an exercise in game theoretical modelling I’ve used in class.

Let’s consider two people walking along a street. We’ll call one of them Alexandre and the other one Bernard.

Alexandre and Bernard have two options available to them. In game theory we call these strategies and write: \(S=\{L,R\}\) where \(L\) denotes walk on the left and \(R\) denotes walk on the right.

We imagine that Alexandre and Bernard close there eyes and simply walk towards each other making a choice from \(S\). To analyse the outcome of these choices we’ll attribute a value of \(1\) to someone who doesn’t bump in to someone else and \(-1\) if they do bump in to the opposite person.

Thus we write:




We usually represent this situation using two matrices, one showing the utility of each player:

\[ A = \begin{pmatrix} 1&-1\\
-1&1 \end{pmatrix} \] \[ B = \begin{pmatrix} 1&-1\\
-1&1 \end{pmatrix} \]

From these matrices it is easy to read the outcomes of any strategy pairs. If Alexandre plays \(L\) and Bernard plays \(R\) then they both get a utility of \(1\). If both are at that strategy pair then neither has a reason to ‘deviate’ their strategy: this is called a Nash Equilibrium.

Of course though (as alluded to in the BBC article), some people might not always do the same thing. Perhaps Bernard would randomly choose from \(S\). In which case it makes sense to refer to what Bernard does by the mixed strategy \(\sigma_{B}=(x,1-x)\) for \(0\leq x\leq 1\).

If we know that Alexandre is playing \(L\) then Bernard’s utility becomes:




Here is a plot of both these utilities:

With a little bit of work that I’ll omit from this post we can show that there exists another Nash equilibrium which is when both Alexandre and Bernard play \(\sigma=(1/2,1/2)\). At this equilibrium the utility to both players is in fact \(u_{A}(\sigma,\sigma)=u_{B}(\sigma,\sigma)=0\).

This Nash equilibrium is in fact much worse than the other Nash equilibria. In a situation with central control (which if you recall the BBC article is not something that happens on British pavements) then we would be operating with either everyone walking on the left or everyone walking on the right so that the utility would be 1. As this is also a Nash Equilibrium then there would be no reason for anyone to change. Sadly it is possible that we get stuck at the other Nash equilibrium where everyone randomly walks on the right or the left (again: at this point no one has an incentive to move). This idea of comparing worst case Nash Equilibrium to the best possible outcome is referred to as the Price of Anarchy and has a lot to do with my personal research. If it is of interest here is a short video on the subject and here’s a publication that looked at the effect of selfish behaviour in public service systems (sadly that is behind a paywall but if anyone would like to read it please do get in touch).

There are some major assumptions being made in all of the above. In particular two people walking with their eyes closed towards each other is probably not the best way to think of this. In fact as all the people on the pavements of Britain constantly walk around you’d expect them to perhaps learn and evolve how they decide to walk.

This in fact fits in to a fascinating area of game theory called evolutionary game theory. The main idea is to consider multiple agents randomly ‘bumping in to each other’ and playing the game above.

Below are two plots showing a simulation of this (using Python code I describe in this video). Here is the script that makes use of this small agent based simulation script:

import Abm
number_of_agents = 1000  # Size of the population
generations = 100  # Number of generations of players
rounds_per_generation = 5  # How many time everyone from a given generation will `play`
death_rate = .1  # How many weak players we get rid of
mutation_rate = .2  # The chance of a player doing something new
row_matrix = [[1, -1], [-1, 1]]  # The utilities
col_matrix = row_matrix
initial_distribution = [{0: 100, 1: 0}, {0:100, 1:0}]  # The initial distribution which in this case has everyone walking on the left

g = Abm.ABM(number_of_agents, generations, rounds_per_generation, death_rate, mutation_rate, row_matrix, col_matrix, initial_distribution)

We see that if we start with everyone walking on one side of the left side of the pavement then things are pretty stable (using a little bit of algebra this can be shown to be a so called ‘evolutionary stable strategy’):

However, if we start with everyone playing the worst possible Nash Equilibrium (randomly choosing a side of the pavement) then we see that this is not stable and in fact we slowly converge towards the population picking a side of the pavement (this is what is called a non evolutionary stable strategy):

So perhaps there is a chance yet for the British to automagically choose a side of the pavement…

July 27, 2014 12:00 AM

July 21, 2014

Amit Jamadagni


Hello everyone,
The last week we focused on the conversions. Some parts are ready for the knots, in sense the standard input conversion is done but there is lots more to add to it. We are returning the braid word as of now where we need to return the braid itself. That would give access to the other methods like the Seifert Matrix and Alexander polynomial. As of now we are thinking of conversions for the links. The braid to pd code has been edited. We no longer maintain the order the crossings, it is just that we encounter a crossing and we start numbering , previously we used to trace the braid out and then order the crossing accordingly. This week the focus is on converting pd code to braid for links, but it seems that we need to even consider the signs for the crossings,  this might affect the way we have been considering the pd code uptil now. There is one more way we can look at the pd code that is assign four new numbers at each crossing, that would completely change the way we have looked at the pd code and would also call for lot of re implementation. I guess considering the signs should be the possible way out for the pd code of the links. In knots we did not have this problem as it was a more structured structure. This is taking a lot of time than expected. Hopefully we have the implementation for links and all issues resolved by this week. Still there is the invariants which have to be implemented (the conway, homfly) but for now the focus is totally on the conversions. That’s it from me. Thanks for reading through. And here is the pull request for the week.

by esornep at July 21, 2014 09:41 PM

Simon Spicer

How to efficiently enumerate prime numbers

There's a part in my code that requires me to evaluate a certain sum
$$ \sum_{p\text{ prime }< \,M} h_E(p) $$
where $h_E$ is a function related to specified elliptic curve that can be evaluated efficiently, and $M$ is a given bound that I know. That is, I need to evaluate the function h_E at all the prime numbers less than $t$, and then add all those values up.

The question I hope to address in this post is: how can we do this efficiently as $M$ gets bigger and bigger? Specifically, what is the best way to compute a sum over all prime numbers up to a given bound when that bound can be very large?

[For those who have read my previous posts (you can skip this paragraph if you haven't - it's not the main thrust of this post), what I want to compute is, for an elliptic curve $E/\mathbb{Q}$, the analytic rank bounding sum $ \sum_{\gamma} \text{sinc}^2(\Delta \gamma) $ over the zeros of $L_E(s)$ for positive parameter $\Delta$; this requires us to evaluate the sum $ \sum_{n < \exp(2\pi\Delta)} c_n\cdot(2\pi\Delta-\log n)$. Here the $c_n$  are the logarithmic derivative coefficients of the completed $L$-function of $E$. Crucially $c_n = 0$ whenever $n$ isn't a prime power, and we can lump together all the terms coming from the same prime; we can therefore express the sum in the form you see in the first paragraph.]

As with so many things in mathematical programming, there is a simple but inefficient way to do this, and then there are more complicated and ugly ways that will be much faster. And as has been the case with other aspects of my code, I've initially gone with the first option to make sure that my code is mathematically correct, and then gone back later and reworked the relevant methods in an attempt to speed things up.


Here's a Python function that will evaluate the sum over primes. The function takes two inputs: a function $h_E$ and an integer $M$, and returns a value equal to the sum of $h_E(p)$ for all primes less than $M$. We're assuming here that the primality testing function is_prime() is predefined.

As you can see, we can achieve the desired outcome in a whopping six lines of code. Nothing mysterious going on here: we simply iterate over all integers less than our bound and test each one for primality; if that integer is prime, then we evaluate the function h_E at that integer and add the result to y. The variable y is then returned at the end.

Why is this a bad way to evaluate the sum? Because there are far more composite integers than there are primes. According to the prime number theorem, the proportion of integers up to $M$ that are prime is approximately $\frac{1}{\log M}$. For my code I want to compute with bounds in the order of $M = e^{8\pi} \sim 10^{11}$; the proportion of integers that are prime up to this bound value is correspondingly about $\frac{1}{8\pi} \sim 0.04$. That is, 96% of the integers we iterate over aren't prime, and we end up throwing that cycle away.

Just how inefficient this method is of course depends on how quickly we can evaluate the primality testing function is_prime(). The best known deterministic primality testing algorithm has running time that scales with (at most) the 6th power of $\log n$, where $n$ is the number being tested. This places primality testing in a class of algorithms called Polynomial Time Complexity Algorithms, which means the runtime of the function scales relatively well with the size of the input. However, what kills us here is the sheer number of times we have to call is_prime() - on all integers up to our bound $M$ - so even if it ran in constant time the prime_sum() function's running time is going to scale with the magnitude of $M$.


We can speed things up considerably by noting that apart from 2, all prime numbers are odd. We are therefore wasting a huge amount of time running primality tests on integers that we know a priori are composite. Assuming is_prime() takes a similar time to execute than our coefficient function h_E(), we could therefore roughly halve the runtime of the prime sum function by skipping the even integers and just checking odd numbers for primality.

We can go further. Apart from 2 and 3, all primes yield a remainder of 1 or 5 when you divide them by 6 (because all primes except for 2 are 1 (modulo 2) and all primes except for 3 are 1 or 2 (modulo 3)). We can therefore skip all integers that are 0, 2, 3 or 4 modulo 6; this means we only have to check for primality on only one third of all the integers less than $M$.

Here's a second version of the prime_sum() function that does this:

Of course we could go even further with the technique by looking at remainders modulo $p$ for more primes $p$ and combining the results: for example, all primes outside of 2, 3 and 5 can only have a remainder of 7, 11, 13, 17, 19, 23 or 29 modulo 30. However, the further you go the more special cases you need to consider, and the uglier your code becomes; as you can see, just looking at cases modulo 6 requires us to write a function about three times as long as previously. This method therefore will only be able to take us so far before the code we'd need to write would become too unwieldy for practicality.


This second prime_sum() version is a rudimentary example of a technique called prime sieving. The idea is to use quick computations to eliminate a large percentage of integers from consideration in a way that doesn't involve direct primality testing, since this is computationally expensive. Sieving techniques are an entire field of research in their own right, so I thought I'd just give as example one of the most famous methods: the Sieve of Eratosthenes (named after the ancient Greek mathematician who is thought to first come up with the idea). This takes as input a positive bound $M$ and returns a list of all prime numbers less than $M$. The method goes as follows:
  1. Start with a list of boolean flags indexed by the numbers 2 through $M$, and set all of them to True. 
  2. Starting at the beginning of the list, let $i$ be the index of the first True entry. Set all entries at indices a multiples of $i$ to False.
  3. Repeat step 2 until the first True entry is at index $> \sqrt{M}$.
  4. Return a list of all integers $i$ such that the entry at index $i$ is True.
This is definitely a case where a (moving) picture is worth a thousand words:
A good graphic representation of the Sieve of Eratosthenes being used to generate all primes less than 121. Courtesy Wikipedia: "Sieve of Eratosthenes animation". Licensed under CC BY-SA 3.0 via Wikimedia Commons
How and why does this work? By mathematical induction, at each iteration the index of the first True entry will always be prime. However any multiple thereof is by definition composite, so we can walk along the list and flag them as not prime. Wash, rinse, repeat. We can stop at $\sqrt{M}$, since all composite numbers at most $M$ in magnitude must have at least one prime factor at most $\sqrt{M}$ in size.

Here is a third version of our prime_sum() function that utilizes the Sieve of Eratosthenes:

Let's see how the three versions stack up against each other time-wise in the Sage terminal. I've saved the three functions in a file called, which I then import up front (if you want to do the same yourself, you'll need to import or define appropriate is_prime() and sqrt() functions at the top of the file). I've also defined a sample toy function h_E() and bound M:

sage: from prime_sum_functions import *
sage: def h_E(n): return sin(float(n))/float(n)
sage: M = 10000
sage: prime_sum_v1(h_E,M)
sage: prime_sum_v2(h_E,M)
sage: prime_sum_v3(h_E,M)
sage: %timeit prime_sum_v1(h_E,M)
1 loops, best of 3: 363 ms per loop
sage: %timeit prime_sum_v2(h_E,M)
1 loops, best of 3: 206 ms per loop
sage: %timeit prime_sum_v3(h_E,M)
10 loops, best of 3: 86.8 ms per loop

Good news! All three functions (thankfully) produce the same result. And we see version 2 is about 1.8 times faster than version 1, while version 3 is four times as fast. These ratios remained roughly the same when I timed the functions on larger bounds, which indicates that the three versions have the same or similar asymptotic scaling - this should be expected, since no matter what we do we will always have to check something at each integer up to the bound.


It should be noted, however, that the Sieve of Eratosthenes as implemented above would be a terrible choice for my GSoC code. This is because in order to enumerate the primes up to $M$ we need to create a list in memory of size $M$. This isn't an issue when $M$ is small, but for my code I need $M \sim 10^{11}$; an array of booleans that size would take up about 12 gigabytes in memory, and any speedups we get from not having to check for primality would be completely obliterated by read/write slowdowns due to working with an array that size. In other words, while the Sieve of Eratosthenes has great time complexity, it has abysmal space complexity.

Thankfully, more memory-efficient sieving methods exist that drastically cut down the space requirements. The best of these - for example, the Sieve of Atkin - need about $\sqrt{M}$ space. For $M \sim 10^{11}$ this translates to only about 40 kilobytes; much more manageable.

Of course, there's always a downside: bleeding edge prime enumeration methods are finicky and intricate, and there are a plethora of ways to get it wrong when implementing them. At some point squeezing an extra epsilon of speedup from your code is no longer worth it in terms of the time and effort it will take to get there. For now, I've implemented a more optimized version of the second prime_sum() function in my code (where we skip over all integers that are obviously not prime), since for now that is my happy middle ground.  If I have time at the end of the project I will revisit the issue of efficient prime enumeration and try implement a more optimized sieving method, but that is a tomorrow problem.

by Simon Spicer ( at July 21, 2014 04:00 PM

July 19, 2014

Vince Knight

Using Github pages and Python to distribute my conference talks

I’m very conscious about wanting to share as much of what I do as a research/instructor in as easy a way as possible. One example of this is the slides/decks that I use for talks I give. 3 or 4 years ago I was doing this with a Dropbox link. More recently this has lead to me putting everything on github but this wasn’t ideal as I’ve started to accumulate a large number of repositories for the various talks I give.

One of the great things about using github though is that for html presentations (reveal.js is the one I’ve used a couple of times), you can use the github deployement branch gh-pages to serve the files directly. A while back I put a video together showing how to do this:

So there are positives and negatives. After moving to a jekyll framework for my blog I started thinking of a way of getting all the positives without any negatives…

  • I want to use a git+github workflow;
  • I don’t want to have a bunch of separate repositories anymore;
  • I want to be able to just write my talks and they automatically appear online.

My immediate thought was to just use jekyll but as far as I can tell I’d have to hack it a bit to get blog posts be actual .pdf files (for my beamer slides) (please tell me if I’m wrong!). I could of course write a short blog post with a link to the file but this was one extra step to just writing my talk that I didn’t want to have to do. Instead of hacking jekyll a bit I decided to write a very simple Python script. You can see it here but I’ll use this blog post to just walk through how it works.

I have a Talks repository:

~$ cd Talks
Talks$ ls


What you can see in there are 3 directories (each starting with a date) and various other files. In each of the talk directories I just have normal files for those talks:

Talks$ cd 2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions/
...$ ls

2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions.nav    2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions.snm
2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions.pdf    2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions.tex

I can work on those slides just as I normally would. Once I’m ready I go back to the root of my Talks directory and run the script:

Talks$ python

This file automatically goes through my sub-directories reading the date from the directory names and identifying .html or .pdf files as talks. This creates the index.html file which is an index of all my talks (sorted by date) with a link to the right file. To get the site online you simply need to push it to the gh-pages branch of your github repository.

You can see the site here: Clicking on relevant links brings up the live version of my talk, or at least as live as my latest push to the github gh-pages branch.

The python script itself is just:

 1 #!/usr/bin/env python
 2 """
 3 Script to index the talks in this repository and create an index.html file.
 4 """
 5 import os
 6 import glob
 7 import re
 9 root = "./"
10 directories = sorted([name for name in os.listdir(root) if os.path.isdir(os.path.join(root, name))], reverse=True)
11 talk_formats = ['.pdf', '.html']
14 index = open('index.html', 'w')
15 index.write(open('head.html', 'r').read())
16 index.write(open('header.html', 'r').read())
18 index.write("""
19             <body>
20             <div class="page-content">
21             <div class="wrap">
22             <div class="home">
23             <ul class='posts'>""")
25 for dir in directories:
26     if dir not in ['.git', 'reveal.js']:
27         talk = [f for f in glob.glob(root + dir + "/" + dir[:10] + '*') if  os.path.splitext(f)[1] in talk_formats][0]
28         date = talk[len(root+dir) + 1: len(root + dir) + 11]
29         title, extension =  os.path.splitext(talk[len(root+dir)+11:].replace("-", " "))
30         index.write("""
31                     <li>
32                     <span class="post-date">%s [%s]</span>
33                     <a class="post-link" href="%s">%s</a>
34                     <p>
35                     <a href="%s">(Source files)</a>
36                     </p>
37                     </li>
38                     """ % (date, extension[1:], talk, title, '' + dir ))
39 index.write("""
40             </ul>
41             </div>
42             </div>
43             </div>
44             """)
45 index.write(open('footer.html', 'r').read())
46 index.write("</body>")
47 index.close()

The main lines that do anything are lines 25-38, everything else just reads in the relevant header and footer files.

So now getting my talks written and online is hardly an effort at all:

# Write awesome talk
Talks$ git commit -am 'Finished talk on proof of finite number of primes'  # This I would do anyway
Talks$ python  # This is the 1 extra thing I need to do
Talks$ git push origin  # This I would do anyway

There are various things that could be done to improve this (including pushing via as an option) and I’m not completely convinced I can’t just use jekyll for this but it was quicker to write that script then to figure it out (or at least that was my conclusion after googling twice).

If anyone has any fixes/improvements (including: “You idiot just run jekyll with the build-academic-conference-talk-site flag”) that’d be super appreciated and if you want to see the Talk repository (with python script, css files, header.html etc…) it’s here:

Now to finish writing my talk for ORAHS2014

July 19, 2014 12:00 AM

July 15, 2014

Vince Knight

Three Days: Two Higher Ed Conferences

Last week I took part in two conferences on the subject of higher education and so here’s a blog post with some thoughts and reflections.

Monday and Tuesday: ‘Workshop on Innovations in University Mathematics Teaching’

The first conference was organised by +Paul Harper, Rob Wilson and myself. The conference website can be found here. The main subject of this was active learning pedagogic techniques, in particular:

  • The flipped classroom;
  • Inquiry Based Learning (IBL).

The plan for these two days included an almost full day of talks on the Monday and an interactive IBL session on the Tuesday.

Here’s a couple of snippets from each session:

  • Robert Talbert gave the opening talk describing the flipped learning environment. You can see his slides here.

    I was quite nervous about Robert’s talk as he’s an expert in flipping classrooms and I was scheduled to talk after him. He gave a great talk (in fairness every single talk on the day was awesome) and here are a couple of things I noted down:

    A flipped classroom does not imply a flipped learning environment!

    A traditional classroom encourages the dependency of a student on the instructor.

    Flipped learning is not just videos out of class and homework in class.

    My favourite:

    The most important part of the flipped classroom is not what happens outside of the classroom (videos etc…) but what happens inside the classroom.

  • I spoke next and you can find my slides here.

    I mainly spoke about the programming course that I teach using a flipped class to our first years. I didn’t want to go in to any details about what a flipped learning environment is as I would most certainly not have been able to do it justice after Robert’s talk so I just gave an exemplar of it in practice. I might blog about the particular approach I used another time.

  • Toby Bailey then gave an excellent talk about the flipped classroom / peer instruction that he has in place for a large class.

    One of the highlights was certainly a video of his class in which we saw students respond to a question via in class clickers and then break in to groups of 2 to discuss the particular problem and finally answer the question one more time. Responses were then put up for all to see and it was great to see that students were indeed improving (you could see the distributions of clicker answers improve after the peer instruction).

    Here are a couple of other things I noted down during the talk:

    It’s not all about the lecturer.

    The importance of getting out of the way.

    Tell the class why you are doing it.

  • Stephen Rutherford then spoke about his flipped classroom in biosciences.

    This was a great talk and it was very neat to have a non mathematical point of view. My first highlight from Steve’s talk can be seen in the photo above. I think that that fundamental question (‘why am I better than a book’) could in fact help improve the instruction of many.

    A flipped classroom allows some control to be put in the hands of the students.

    The reason students are at university is to get an education and not necessarily a degree.

  • We then moved on to a relaxed panel discussion about the flipped classroom, one of the things that I think was a big highlight of that was the importance of involving students in the pedagogic reasoning behind whatever approach is used in a class.

  • The final ‘talk’ of the day was by Chris Sangwin who talked about his Moore Method class.

    This was a fascinating talk as Chris clearly described the success he has had with implementing a Moore Method class.

    In particular he highlighted the importance of he role of the instructor in this framework where students are given a set of problems to work through and present to their peers (there is no lecturing in a Moore method class).

    Some highlights:

    In 2007, after his class finished students found the book from which his problems originated and continued to work through them on their own.

    In 2008, students set up a reading group and started to read complex mathematical topics.

  • The rest of the conference was a natural continuation from Chris’s talk as Dana Ernst and TJ Hitchman spoke about Inquiry Based Learning (a pedagogic term that encompasses the Moore method - I’m sure someone can correct me if I got that subtlety wrong).

    This was a really great interactive session that ran over to Tuesday. There is far too much that happened in this session and it was hard to take notes as we were very much involved but here is some of the things that stuck for me.

    1. TJ ran a great first little session that basically got us to think about what we want to be as educators. One of the main things that came out of the ‘what do you want your students to remember in 20 years time’ question was that very few of us (I’m not sure if anyone did) mentioned the actual content of the courses we teach.
    2. The importance of creating a safe environment in which students can fail (in order to learn). Productive failure.
    3. The various difficulties associated with implementing an IBL approach due to class size (this was a recurring theme with regards to UK vs US class sizes).

    Another important point was the criteria that defines an IBL approach:

    Students are in charge of not only generating the content but also critiquing the content.

    You can find all of Dana and TJ’s content on their github repository.

After this session I enjoyed a good chat with TJ who helped me figure out how to make my R/SAS course better. After that my project student who will be working with me to evaluate my flipped classroom had a great talk with Robert, TJ and Dana who gave some really helpful advice. One of the highlights that came out of this was Robert putting very simply what I believe defines an effective pedagogic approach. Hopefully Robert will either correct me or forgive me for paraphrasing (EDIT: He has corrected me in the comments):

Whatever the approach: flipped classroom, IBL, interpretive dance, as long as the system allows you to empower your students and monitor how they are learning it is worth doing.

I’m probably forgetting quite a few details about the workshop (including the 6+ course conference dinner which was pretty awesome). Now to describe the next conference which was the Cardiff University Annual Learning and Teaching Conference

Wednesday: Cardiff University Annual Learning and Teaching Conference

This was my first time attending this conference and I was lucky enough to have my abstract accepted so I was able to give a talk.

You can find my slides here.

In all honesty I was kind of tired so I didn’t take as detailed notes as I would like and/or as many photos but here are some highlights:

  • I enjoyed Stephen Rutherford discussing the plans of the Biosciences school to bring in peer assessment:

    Assessment for learning and not of learning

  • I liked Anne Cunningham reminding everyone that students obtain content from a variety of sources when talking about their using of

    The curators are not just the staff. The prime curators are the students.

  • Rob Wilson and Nathan Roberts gave an overview of the tutoring system that we as a university will be heading towards.

A great three days

This brings my attendance at education conference to a total of 3 and I must say that I can’t wait to go to the next one (incidentally my abstract for the CETL-MSOR conference got accepted today). I really enjoy the vibe at education conferences as there is a slight sense of urgency in case anyone says anything that someone might be able to use/adapt/steal so as to improve their teaching and have a real impact on our students’ lives.

Two final links:

  • The #innovcardiff hashtag if you would like to see what was being said online about the innovation conference (big applause to Calvin Smith who did a tremendous job on there!);
  • The #cardiffedu hashtag if you would like to see the same for the Cardiff University education conference.

If anyone who attended the conferences has anything to add it would be great to hear from you and if anyone couldn’t make it but would like to know more: please get in touch :)

July 15, 2014 12:00 AM

July 14, 2014

Simon Spicer


I'm at the stage where my code essentially works: it does everthing I initially set out to have it do, including computing the aforementioned zero sums for elliptic curve $L$-functions. However, the code is written in pure Python, and it is therefore not as fast as it could be.

This is often not a problem; Python is designed to be easy to read and maintain, and I'm hoping that the Python code I wrote is both of those. If we were just planning to run it on elliptic curves with small coefficients - for example, the curve represented by the equation $y^2=x^3-16x+16$ - this wouldn't be an issue. Curves with small coefficients have small conductors and thus few low-lying zeros near the central point, which allows us to run the zero sum code on them with small Delta parameter values. A small Delta value means the computation will finish very quickly regardless of how efficiently it's implemented, so it probably wouldn't be worth my while trying to optimize the code in that case.

To illustrate this point, here is the first most high-level, generic version of the method that computes the sum $\sum_{\gamma} \text{sinc}^2(\Delta \gamma)$ over the zeros of a given elliptic curve $L$-function (minus documentation):

[Of course, there's plenty going on in the background here. I have a separate method, which computes the logarithmic derivative coefficients, and I call the SciPy function spence() to compute the part of the sum that comes from the Fourier transform of the digamma function $\frac{\Gamma^{\prime}}{\Gamma}$. Nevertheless, the code is simple and straightforward, and (hopefully) it's easy to follow the logic therein.]

However, the whole point of my GSoC project is to produce code that can be used for mathematical research; ultimately we want to push computations as far as we can and run the code on elliptic curves with large conductor, since curves with small conductor are already well-understood. Ergo, it's time I thought about going back over what I've written and seeing how I can speed it up.

There are two distinct ways to achieve speedups. The first is to rewrite the code more cleverly to eliminates unnecessary loops, coercions, function calls etc. Here is a second version I have written of the same function (still in Python):

The major change I've made between the two versions is improving how the sum involving the logarithmic derivative coefficients is computed - captured in the variable y. In the first version, I simply iterated over all integers $n$ up to the bound $t$, calling the method each time. However, the logarithmic derivative coefficient $c_n$ is zero whenever $n$ is not a prime power, and knowing its value for $n=p$ a prime allows us to efficiently compute its value for $p^k$ any power of that prime. It therefore makes sense to do everything "in-house": eliminate the method call to, iterate only over primes, and compute the logarithmic derivative coefficients for all powers of a given prime together.

Let's see how the methods match up in terms of speed. Below we run the two versions of the zero sum method on the elliptic curve $E: y^2=x^3-16x+16$, which is a rank 1 curve of conductor 37:

sage: import sage.lfunctions.zero_sums as zero_sums
sage: ZS = zero_sums.LFunctionZeroSum(EllipticCurve([-16,16]))
sage: ZS._zerosum_sincsquared(Delta=1)
sage: ZS._zerosum_sincsquared_fast(Delta=1)
sage: %timeit ZS._zerosum_sincsquared(Delta=1)
10 loops, best of 3: 20.5 ms per loop
sage: %timeit ZS._zerosum_sincsquared_fast(Delta=1)
100 loops, best of 3: 3.46 ms per loop

That's about a sixfold speedup we've achieved, just by reworking the section of the code that computes the $c_n$ sum.

The downside, of course, is that the code in method version 2 is more complicated - and thus less readable - than that in version 1. This is often the case in software development: you can write code that is elegant and easy to read but slow, or you can write code that is fast but horribly complicated and difficult to maintain. And when it comes to mathematical programming, unfortunately, sometimes the necessity for speed trumps readability.

The second major way to achieve speedups is to abandon pure Python and switch to a more low-level language. I could theoretically take my code and rewrite it in C, for example; if done relatively intelligently I'm sure the resulting code would blow what I have here out the water in terms of speed. However, I have no experience writing C code, and even if I did getting the code to interface with the rest of the Sage codebase would be a major headache.

Thankfully there is a happy middle ground: Cython. Cython is a programming language - technically a superset of Python - that allows direct interfacing with C and C++ data types and structures. Code written in Cython can be as fast as C code and nearly as readable as pure Python. Crucially, because it's so similar to Python it doesn't require rewriting all my code from scratch. And Sage already knows how to deal with Cython, so there aren't any compatibility issues there.

I am therefore in the process of doing exactly that: rewriting my code in Cython. Mostly this is just a cut-and-paste job and is pleasingly hassle-free; however, in order to achieve the desired speedups, the bottleneck methods - such as our $\text{sinc}^2$ zero sum method above - must be modified to make use of C data types.

Here is the third, most recent version of the _zerosum_sincsquared() method for our zero sum class, this time written in Cython:

Definitely longer and uglier. I now must declare my (C) variables up front; previously Python just created them on the fly, which is nice but slower than allocating memory space for the variables a priori. I've eliminated the use of complex arithmetic, so that everything can be done using C integer and float types. I still iterate over all primes up to the bound $t$; however now I deal with those primes that divide the conductor of $E$ (for which the associated summand is calculated slightly differently) beforehand, so that in the main loop I don't have to check at each point if my prime $p$ divides the conductor or not [This last check is expensive: the conductor $N$ can be very large - too large to cast as a $C$ long long even - so we would have to use slower Python or Sage data types to represent it. Better to get rid of the check altogether].

Let's see how this version holds up in a speed test. The Cython code has already been built into Sage and the class loaded into the global namespace, so I can just call it without having to attach or load any file:

sage: ZS = LFunctionZeroSum(EllipticCurve([-16,16]))
sage: ZS._zerosum_sincsquared_fast(Delta=1)
sage: %timeit ZS._zerosum_sincsquared_fast(Delta=1)
1000 loops, best of 3: 1.72 ms per loop

The good news: the Cythonized version of the method produces the same output as the Python versions, and it's definitely faster. The bad news: the speedup is only about a factor of 2, which isn't hugely impressive given how much uglier the code is.

Why is this? Crucially, we still iterate over all integers up to the bound $t$, testing for primality at each step. This is very inefficient: most integers are not prime (in fact, asymptotically 0 percent of all positive integers are prime); we should be using sieving methods to eliminate primality testing at those integers which we know before checking are composite. For example, we should at the very least only ever iterate over the odd numbers beyond 3. That immediately halves the number of primality tests we have to do, and we should therefore get a comparable speedup if primality testing is what is dominating the runtime in this method.

This is therefore what I hope to implement next: rework the zero sum code yet again to incorporate prime sieving. This has applicability beyond just the $\text{sinc}^2$ method: all explicit formula-type sums at some point invoke summing over primes or prime powers, so having access to code that can do this quickly would be a huge boon.

by Simon Spicer ( at July 14, 2014 12:37 PM

July 13, 2014

Amit Jamadagni


Hello everyone,
This week we have made an attempt at implementing the Jones Polynomial. We have used the trip matrix of the knot to determine the Jones polynomial. The trip matrix of a knot is determined by the following process. We number the crossings randomly, and we start moving along the knot, let T be the matrix and T ij  be the elements. Now we start with the crossing i and see how many times we have encountered the crossing j until we return to the crossing i. We take mod2 of this value and fill that matrix element. So in this way we construct all the elements except the diagonal elements. For the diagonal elements we see whether i is a positive cross or negative cross. If it is a positive cross we fill it with zero and for the negative cross we fill it with 1. Now we have the initial trip matrix. To evaluate the Jones polynomial we smooth the crossings until we have a link for which we know the Kauffman’s bracket. So this decomposition here is looked by the matrix. So for the initial diagonal elements of the trip matrix we assign a certain type of smoothing and determine the number of seifert circles. Now we construct a new matrix by doing the following, we choose a crossing and smooth it in another way(different from the first), the only elements which are different from the initial matrix are the diagonal elements and the only element which changes when we do such kind of a smoothing is the crossing number element. In sense if we change the smoothing at crossing i we change the number at the matrix element T ii   (this is flipped from either one to zero or zero to one). Again we continue this until all the options are exhausted. Then for every matrix we have certain coefficients of the jones polynomial. Adding all these up gives the jones polynomial for the knot. I know it is tough to follow but that is the gist of the algorithm. I have followed the following material and I request the readers to have a look at them for greater understanding.

First Reference:

Second Reference :
A matrix for computing the Jones Polynomial of a Knot by Louis Zulli

We have dedicated some time for documenting the code that we coded till now. There have been some edge cases where the code showed some inconsistency. We are working on the edge cases as well as cleaning the code alongside continuing the implementation of the invariants.
Here is the pull request for the week:
Here is the entire code uptil now:

by esornep at July 13, 2014 11:55 PM

July 11, 2014

Nikhil Peter

Sage Android – UI Update

It’s been a busy week so far in the land of UI improvements. Disclaimer: I’m pretty bad at UI, so any and all suggestions are welcome. Firstly, last week’s problems have been resolved viz. Everything is saved nicely on orientation change, including the interacts, which required quite a bit of effort. In short, it involved […]

by hav3n at July 11, 2014 05:59 PM

July 08, 2014

Simon Spicer

The Birch and Swinnerton-Dyer Conjecture and computing analytic rank

Let $E$ be an elliptic curve with $L$-function $L_E(s)$. Recall that Google Summer of Code project is to implement in Sage a method that allows us to compute $\sum_{\gamma} f(\Delta \gamma)$, where $\gamma$ ranges over the imaginary parts of the nontrivial zeros of $L_E$, $\Delta$ is a given positive parameter, and $f(x)$ is a specified symmetric continuous integrable function that is 1 at the origin. The value of this sum then bounds the analytic rank of $E$ - the number of zeros at the central point - from above, since we are summing $1$ with multipliticy $r_{an}$ in the sum, along with some other nonzero positive terms (that are hopefully small). See this post for more info on the method.

One immediate glaring issue here is that zeros that are close to the critical point will contribute values that are close to 1 in the sum, so the curve will then appear to have larger analytic rank than it actually does. An obvious question, then, is to ask: how close can the noncentral zeros get to the central point? Is there some way to show that they cannot be too close? If so, then we could figure out just how large of a $\Delta$ we would need to use in order to get a rank bound that is actually tight.

The rank zero curve $y^2 = x^3 -  x^3 - 7460362000712 x - 7842981500851012704$ has an extremely low-lying zero at $\gamma = 0.0256$ (and thus another one at $-0.0256$; as a result the zero sum looks like it's converging towards a value of 2 instead of the actual analytic rank of zero. In order to actually get a sum value out that's less than one we would have to use a $\Delta$ value of about 20; this is far beyond what's feasible due to the exponential dependence of the zero sum method on $\Delta$.

The good news is that there is hope in this regard; the nature of low-lying zeros for elliptic curve $L$-functions is actually the topic of my PhD dissertation (which I'm still working on, so I can't provide a link thereto just yet!). In order to understand how close the lowest zero can get to the central point we will need to talk a bit about the BSD Conjecture.

The Birch and Swinnerton-Dyer Conjecture is one of the two Clay Math Millenium Problems related to $L$-functions. The conjecture is comprised of two parts; the first part I mentioned briefly in this previous post. However, we can use the second part to gain insight into how good our zero sum based estimate for analytic rank will be.

Even though I've stated the first part of the BSD Conjecture before, for completeness I'll reproduce the full conjecture here. Let $E$ be an elliptic curve defined over the rational numbers, e.g. a curve represented by the equation $y^2 = x^3 + Ax + B$ for some integers $A$ and $B$ such that $4A^3+27B^2 \ne 0$. Let $E(\mathbb{Q})$ be the group of rational points on the elliptic curve, and let $r_{al}$ be the algebraic rank of $E(\mathbb{Q})$. Let $L_E(s)$ be the $L$-function attached to $E$, and let $L_E(1+s) = s^{r_{an}}\left[a_0 + a_1 s + \ldots\right]$ be the Taylor expansion of $L_E(s)$ about $s=1$ such that the leading coefficient $a_0$ is nonzero; $r_{an}$ is called the analytic rank of $E$ (see here for more details on all of the above objects). The first part of the BSD conjecture asserts that $r_{al}=r_{an}$; that is, the order of vanishing of the $L$-function about the central point is exactly equal to the number of free generators in the group of rational points on $E$.

The second part of the conjecture asserts that we actually know the exact value of that leading coefficient $a_0$ in terms of other invariants of $E$. Specifically:
$$ a_0 = \frac{\Omega_E\cdot\text{Reg}_E\cdot\prod_p c_p\cdot\#\text{Ш}(E/\mathbb{Q})}{(\#E_{\text{Tor}}(\mathbb{Q}))^2}. $$

Fear not if you have no idea what any of these quantities are. They are all things that we know how to compute - or at least estimate in size. I provide below brief descriptions of each of these quantities; however, feel free to skip this part. It suffices to know that we have a formula for computing the exact value of that leading coefficient $a_0$.
  1. $\#E_{\text{Tor}}(\mathbb{Q})$ is the number of rational torsion points on $E$. Remember that the solutions $(x,y)$ to the elliptic curve equation $y^2 = x^3 + Ax+B$, where $x$ and $y$ are both rational numbers, form a group. Recall also that that the group of rational points $E(\mathbb{Q})$ may be finite or infinite, depending on whether the group has algebraic rank zero, or greater than zero. However, it turns out that there are only ever finitely many torsion points - those which can be added to themselves some finite number of times to get the group identity element. These points of finite order form a subgroup, denoted $E_{\text{Tor}}(\mathbb{Q})$, and the quantity in question is just the size of this finite group (squared in the formula). In fact, it's been proven that the size of $E_{\text{Tor}}(\mathbb{Q})$ is at most 16.
  2. $\Omega_E$ is the real period of $E$. This is perhaps a bit more tricky to define, but it essentially is a number that measures the size of the set of real points of $E$. If you plot the graph of the equation representing $E: y^2 = x^3 + Ax + B$ on the cartesian plane, you get something that looks like one of the following:

    The plots of the real solutions to four elliptic curves, and their associated real periods.

    There is a way to assign an intrinsic "size" to these graphs, which we denote the real period $\Omega_E$. The technical definition is that $\Omega_E$ is equal to the integral of the absolute value of the differential $\omega = \frac{dx}{2y}$ along the part of the real graph of $E$ that is connected to infinity (that or twice that depending on whether the cubic equation $x^3 + Ax + B$ has one or three real roots respectively).
  3. $\text{Reg}_E$ is the regulator of $E$. This is a number that measures the "density" of rational points on $E$. Recall that $E(\mathbb{Q}) \approx T \times \mathbb{Z}^{r_{an}}$, i.e there free part of $E(\mathbb{Q})$ is isomorphic to $r_{an}$ copies of the integers. There is a canonical way to embed the free part of $E(\mathbb{Q})$ in $\mathbb{R}^{r_{an}}$ as a lattice; the regulator $\text{Reg}_E$ is the volume of the fundamental domain of this lattice. The thing to take away here is that elliptic curves with small regulators have lots of rational points whose coefficients have small numerators and denominators, while curves with large regulators have few such points.
  4. $\prod_p c_p$ is the Tamagawa product for $E$. For each prime $p$, one can consider the points on $E$ over the $p$-adic numbers $\mathbb{Q}_p$. The Tamagawa number $c_p$ is the ratio of the size of the full group of $p$-adic points on $E$ to the subgroup of $p$-adic points that are connected to the identity. This is always a positive integer, and crucially, in all but a finite number of cases the ratio is 1. Thus we can consider the product of the $c_p$ as we range over all prime numbers, and this is precisely the definition of the Tamagawa product.
  5. $\#\text{Ш}(E/\mathbb{Q})$ is the order of the Tate-Shafarevich group of $E$ over the rational numbers. The Tate-Shafarevich group of $E$ is probably the most mysterious part of the BSD formula; it is defined as the subgroup of the Weil–Châtelet group $H^1(G_{\mathbb{Q}},E)$ that becomes trivial when passing to any completion of $\mathbb{Q}$. If you're like me then this definition will be completely opaque; however, we can think of $\text{Ш}(E/\mathbb{Q})$ as measuring how much $E$ violates the local-global principle: that one should be able to find rational solutions to an algebraic equation by finding solutions modulo a prime number $p$ for each $p$, and then piecing this information together with the Chinese Remainder Theorem to get a rational solution. Curves with nontrivial $\text{Ш}$ have homogeneous spaces that have solutions modulo $p$ for all $p$, but no rational points. The main thing here is that $\text{Ш}$ is conjectured to be finite, in which case $\#\text{Ш}(E/\mathbb{Q})$ is just a positive integer (in fact, it can be shown for elliptic curves that if $\text{Ш}$ is indeed finite, then its size is a perfect square).
Why is the BSD Conjecture relevant to rank estimation? Because it helps us overcome the crucial obstacle to computing analytic rank exactly: without extra knowledge, it's impossible to decide using numerical techniques whether the $n$th derivative of the $L$-function at the central point is exactly zero, or just so small that it looks like it is zero to the precision that we're using. If we can use the BSD formula to show a priori that $a_0$ must be at least $M_E$ in magnitude, where $M_E$ is some number that depends only on some easily computable data attached to the elliptic curve $E$, then all we need to do is evaluate successive derivatives of $L_E$ at the central point to enough precision to decide if that derivative is less than $M_E$ or not; this is readily doable on a finite precision machine. Keep going until we hit a derivative which is then definitely greater than $M_E$ in magnitude, at which we can halt and declare that the analytic rank is precisely the order of that derivative.

In the context of the explicit formula-based zero sum rank estimation method implemented in our GSoC project, the size of the leading coefficient also controls how far close the lowest noncentral zero can be from the central point. Specifically, we have the folling result: Let $\gamma_0$ be the lowest-lying noncentral zero for $L_E$ (i.e. the zero closest to the central point that is not actually at the central point); let $\Lambda_E(s) = N^{s/2}(2\pi)^s\Gamma(s)L_E(s)$ is the completed $L$-function for $E$, and let $\Lambda_E(1+s) = s^{r_{an}}\left[b_0 + b_1 s + b_2 s^2 \ldots\right]$ be the Taylor expansion of $\Lambda_E$ about the central point. Then:
$$ \gamma_0 > \sqrt{\frac{b_0}{b_2}}. $$
Thankfully, $b_0$ is easily relatable back to the constant $a_0$, and we have known computable upper bounds on the magnitude of $b_2$, so knowing how small $a_0$ is tells us how close the lowest zero can be to the central point. Thus bounding $a_0$ from below in terms of some function of $E$ tells us how large of a $\Delta$ value we need to use in our zero sum, and thus how long the computation will take as a function of $E$.

If this perhaps sounds a bit handwavy at this point it's because this is all still a work in progress, to be published as part of my dissertation. Nevertheless, the bottom line is that bounding the leading $L$-series coefficient from below gives us a way to directly analyze the computational complexity of analytic rank methods. 

I hope to go into more detail in a future post about what we can do to bound from below the leading coefficient of $L_E$ at the central point, and why this is a far from straightforward endeavour.

by Simon Spicer ( at July 08, 2014 06:04 PM

July 06, 2014

David Horgan (quantumtetrahedron)

animated knot

This week I have been working on a variety of topics, but I’d like to post about some sagemath knot theory investigations Ive been doing.  This is connected to one of my longer term aims, which  is to  perform SU(2) recoupling calculations based on  Temperley-Lieb algebras using Sage Mathematics Software. Essentially I would like to to write a set of tools for doing SU(2) recoupling theory, Penrose binor calculus and  Temperley—Lieb algebra calculations using sagemath. 

Lets see what sagemath can do:

knots code

knot output1




animate knot

animated knot



by David Horgan at July 06, 2014 07:23 PM

Amit Jamadagni


Hello everyone,
The past week has been really on the thinking end. The task was to detect the braid word from the oriented gauss code. So I mentioned the dots were to be connected, but then I had problems implementing it. I was thinking more and more rather than a looking for the straight implementation. It took me a week to break through and then I had some consistent results.  So they were various ideas on how one could detect the braidword. One was to update the outgoing strands after each crossing has been looked into and then match it with were we have the incoming. I was working on various other ideas as well as I felt there was some kind of difficulty on the implementation end. Here are the gists of code on the ideas I worked upon.

The ones I attempted

The one which is working

This was like I had all the answers yet I was missing something. There are some glitches still though, small condition on the seifert circles one is missing. I need to work on the documentation end as we are missing on that. And here is the pull request for the past two weeks.

Lot of work as to be done on the documenting and testing side of the code. Here are the results for the algorithm

sage: from sage.knots import link
sage: L = link.Link(oriented_gauss_code = [[-1, +2, -3, 4, +5, +1, -2, +6, +7, 3, -4, -7, -6,-5],[‘-‘,’-‘,’-‘,’-‘,’+’,’-‘,’+’]])

Seifert Circles
sage: L.seifert_circles()
[[9, 13, 9], [4, 12, 10, 4], [2, 8, 14, 6, 2], [1, 7, 3, 11, 5, 1]]

PD Code
sage: L.PD_code()
[[1, 7, 2, 6],
[7, 3, 8, 2],
[3, 11, 4, 10],
[11, 5, 12, 4],
[14, 5, 1, 6],
[13, 9, 14, 8],
[12, 9, 13, 10]]

sage: L.regions()
[[4, -11],
[2, -7],
[6, -1],
[13, 9],
[-4, -10, -12],
[-8, -2, -6, -14],
[10, -3, 8, -13],
[14, -5, 12, -9],
[7, 3, 11, 5, 1]]

We perform a move if they are bad regions
sage: L.remove_regions()
[[1, 9, 2, 8],
[9, 3, 10, 2],
[5, 13, 6, 12],
[13, 7, 14, 6],
[18, 7, 1, 8],
[17, 11, 18, 10],
[14, 11, 15, 12],
[16, 4, 17, 3],
[15, 4, 16, 5]]

Final method gives the above information after all the moves are made
Seifert Circles
[[[4, 18, 4],
[15, 21, 15],
[1, 9, 3, 19, 11, 17, 5, 13, 7, 1],
[2, 10, 20, 16, 12, 6, 14, 22, 8, 2]],

[[-15, -21],
[6, -13],
[2, -9],
[8, -1],
[18, 4],
[-3, 10, -19],
[12, -5, -17],
[20, 16, -11],
[14, 22, -7],
[19, 11, 17, -4],
[21, -14, -6, -12, -16],
[15, -20, -10, -2, -8, -22],
[5, 13, 7, 1, 9, 3, -18]],

PD Code
[[1, 9, 2, 8],
[9, 3, 10, 2],
[5, 13, 6, 12],
[13, 7, 14, 6],
[22, 7, 1, 8],
[19, 11, 20, 10],
[16, 11, 17, 12],
[18, 4, 19, 3],
[17, 4, 18, 5],
[21, 14, 22, 15],
[20, 16, 21, 15]]]

And finally we convert it to braid
sage: L.seifert_to_braid()
[-1, -2, -3, 2, 1, -2, -2, 3, 2, -2, -2]

One more example :

sage: from sage.knots import link

sage: L = link.Link(oriented_gauss_code = [[-1, +2, 3, -4, 5, -6, 7, 8, -2, -5, +6, +1, -8, -3, -4, -7],[‘-‘,’-‘,’-‘,’-‘,’+’,’+’,’-‘,’+’]])

sage: L.seifert_circles()
[[2, 10, 6, 12, 2], [4, 16, 8, 14, 4], [1, 13, 9, 3, 15, 5, 11, 7, 1]]

sage: L.regions()
[[6, -11],
[15, -4],
[9, 3, -14],
[2, -9, -13],
[1, 13, -8],
[12, -1, -7],
[5, 11, 7, -16],
[-3, 10, -5, -15],
[-6, -10, -2, -12],
[16, 8, 14, 4]]

sage: L.PD_code()
[[1, 13, 2, 12],
[9, 3, 10, 2],
[14, 4, 15, 3],
[4, 16, 5, 15],
[10, 5, 11, 6],
[6, 11, 7, 12],
[16, 8, 1, 7],
[13, 8, 14, 9]]

sage: L.remove_regions()
‘No move required’

[[[2, 10, 6, 12, 2], [4, 16, 8, 14, 4], [1, 13, 9, 3, 15, 5, 11, 7, 1]],
[[6, -11],
[15, -4],
[9, 3, -14],
[2, -9, -13],
[1, 13, -8],
[12, -1, -7],
[5, 11, 7, -16],
[-3, 10, -5, -15],
[-6, -10, -2, -12],
[16, 8, 14, 4]],
[[1, 13, 2, 12],
[9, 3, 10, 2],
[14, 4, 15, 3],
[4, 16, 5, 15],
[10, 5, 11, 6],
[6, 11, 7, 12],
[16, 8, 1, 7],
[13, 8, 14, 9]]]

sage: L.seifert_to_braid()
[-1, 2, -1, -2, -2, 1, 1, -2]

So this week we are looking at jones polynomial implementation and lots of documentation and testing.Thanks for reading through.

by esornep at July 06, 2014 01:50 PM

Vince Knight

My CSV python video gets 10000 views

So the other day I got the following notification on my phone:

This both terrified me and made me smile. It’s nice to think that a bunch of people have (hopefully) been helped with what I put together but also slightly worrying as I think I would be able to put together a much better video if I did it now.

Here it is:

The basic code I use in the video is:

import csv

out = open('data.csv', 'r')  # Open a file in read mode
data = csv.reader(out)  # Initiate a csv reader object which will parse the data
data = [[row[0], eval(row[1]), eval(row[2])] for row in data]  # Read in the data and convert certain entries of each row
out.close()  # Close the file

new_data = [[row[0], row[1] + row[2]] for row in data]  # Create some new data

out = open('new_data.csv', 'w')  # Open a file in write mode
output = csv.writer(out)  # Initiate a csv writer object which will write the data in the correct format (csv)

for row in new_data:  # Loop through the data
    output.writerow(row)  # Write the row to file

out.close()  # Close the file

There are a couple of other videos that are getting near that landmark, this is possibly my favourite of them:

The above video shows how to simulate a basic queue in Python. My Summer student James Campbell and I are working on something related at the moment.

July 06, 2014 12:00 AM

July 02, 2014

Vince Knight

A gitignore file for Python (Sage) and LaTeX

The main tools I use on a day to day basis are Python/Sage and LaTeX. Since learning to love git and even more so github my usual workflow when starting a new project is something like this:

$ git init
$ vim
$ vim application_for_fields_medal.tex
$ latexmk --xelatex application_for_fields_medal.tex
$ git status

Once I run the git status that’s when I realise that I’ve now got a bunch of files I don’t want to commit .aux, .pyc etc…

I then normally throw whatever those files are to a .gitignore file. Despite teaching my students to never do anything twice and to always script whatever can be scripted: I have never actually put a .gitignore file together that has everything in it.

So here’s my first attempt.

This is basically a modification of the Python github .gitignore and a slight tweak of this gist (there doesn’t seem to be a bespoke LaTeX .gitignore as an option when you create a repository).

# LaTeX Files

# Byte-compiled / optimized / DLL files

# C extensions

# Distribution / packaging

# Installer logs

# Unit test / coverage reports

# Translations

# Mr Developer

# Rope

# Django stuff:

# Sphinx documentation

You can find this at this gist.

If anyone can suggest anything I’m missing it’d be great to hear it :)

July 02, 2014 12:00 AM

June 30, 2014

Amit Jamadagni


Hello everyone,
Last week we implemented the Vogel’s algorithm Part – 1, which would check for the bad regions and remove them. This week started off  tweaking the last week’s code. I was thinking that instead of performing a move and then looking for the bad regions, why not perform all the moves necessary to correct the bad regions in a single go. But this turned out to be a wrong approach as the seifert circles expected were not matching with the results we got. So we choose a bad region, perform a move and again see if there are bad regions, basically a while loop which captures this was implemented. If there are no bad regions we move onto the next part which is detecting the braidword of the knot. In the part 2, we use the following information from part 1 of the algorithm, the PD Code, seifert circles, regions with respect to the final knot after we have removed the necessary regions by performing the moves.  Now there are exactly two seifert circles which match with the regions, we select one of them and see with what seifert circle it shares a crossing and number the second seifert circle as 2 and so on and so forth.
For example here are the results we get :
sage: L = link.Link(oriented_gauss_code = [[-1, +2, -3, 4, +5, +1, -2, +6, +7, 3, -4, -7, -6,-5],[‘-‘,’-‘,’-‘,’-‘,’+’,’-‘,’+’]])

Final method gives out the final seifert circles, regions and the planar code of the modified knot. We see exactly two regions coincide with the seifert circles.

seifert circles
[[[18, 4],
[21, 15],
[9, 3, 19, 11, 17, 5, 13, 7, 1],
[10, 20, 16, 12, 6, 14, 22, 8, 2]]

[[-15, -21],
[6, -13],
[2, -9],
[8, -1],
[18, 4],
[-3, 10, -19],
[12, -5, -17],
[20, 16, -11],
[14, 22, -7],
[19, 11, 17, -4],
[21, -14, -6, -12, -16],
[15, -20, -10, -2, -8, -22],
[5, 13, 7, 1, 9, 3, -18]]

[[1, 9, 2, 8],
[9, 3, 10, 2],
[5, 13, 6, 12],
[13, 7, 14, 6],
[22, 7, 1, 8],
[19, 11, 20, 10],
[16, 11, 17, 12],
[18, 4, 19, 3],
[17, 4, 18, 5],
[21, 14, 22, 15],
[20, 16, 21, 15]]]

#this is still a method in development
sage: L.seifert_to_braid()
here we have numbered the seifert circles which form the strands in the braidword
{1: [18, 4], 2: [9, 3, 19, 11, 17, 5, 13, 7, 1], 3: [10, 20, 16, 12, 6, 14, 22, 8, 2], 4: [21, 15]}

#this is other information required for the ordering of the crossings
{1: [[18, 4, 19, 3], [17, 4, 18, 5]], 2: [[1, 9, 2, 8], [9, 3, 10, 2], [5, 13, 6, 12], [13, 7, 14, 6], [22, 7, 1, 8], [19, 11, 20, 10], [16, 11, 17, 12], [18, 4, 19, 3], [17, 4, 18, 5]], 3: [[1, 9, 2, 8], [9, 3, 10, 2], [5, 13, 6, 12], [13, 7, 14, 6], [22, 7, 1, 8], [19, 11, 20, 10], [16, 11, 17, 12], [21, 14, 22, 15], [20, 16, 21, 15]], 4: [[21, 14, 22, 15], [20, 16, 21, 15]]}
{1: [‘-‘, ‘+’], 2: [‘-‘, ‘-‘, ‘-‘, ‘-‘, ‘+’, ‘-‘, ‘+’, ‘-‘, ‘+’], 3: [‘-‘, ‘-‘, ‘-‘, ‘-‘, ‘+’, ‘-‘, ‘+’, ‘+’, ‘-‘], 4: [‘+’, ‘-‘]}
{1: [[4, 19], [18, 5]], 2: [[9, 2], [3, 10], [13, 6], [7, 14], [1, 8], [11, 20], [17, 12], [4, 19], [18, 5]], 3: [[9, 2], [3, 10], [13, 6], [7, 14], [1, 8], [11, 20], [17, 12], [22, 15], [16, 21]], 4: [[22, 15], [16, 21]]}
{1: [[18, 3], [17, 4]], 2: [[1, 8], [9, 2], [5, 12], [13, 6], [22, 7], [19, 10], [16, 11], [18, 3], [17, 4]], 3: [[1, 8], [9, 2], [5, 12], [13, 6], [22, 7], [19, 10], [16, 11], [21, 14], [20, 15]], 4: [[21, 14], [20, 15]]}

So the next and the final part is to trace through the ordering of the crossings and assign them with a braid generator and that generates the braidword.
The last part of ordering has been a tough part to generalize and code. It is like we have all the information but the dots are yet to be connected to get the final picture. I would like to leave you with two examples which we start from the first.

#so we start with the oriented gauss code
sage: L = link.Link(oriented_gauss_code = [[-1, +2, -3, 4, +5, +1, -2, +6, +7, 3, -4, -7, -6,-5],[‘-‘,’-‘,’-‘,’-‘,’+’,’-‘,’+’]])
#get the pd_code
sage: L.PD_code()
[[1, 7, 2, 6],
[7, 3, 8, 2],
[3, 11, 4, 10],
[11, 5, 12, 4],
[14, 5, 1, 6],
[13, 9, 14, 8],
[12, 9, 13, 10]]
#compute the regions
sage: L.regions()
[[4, -11],
[2, -7],
[6, -1],
[13, 9],
[-4, -10, -12],
[-8, -2, -6, -14],
[10, -3, 8, -13],
[14, -5, 12, -9],
[7, 3, 11, 5, 1]]
#compute the seifert circles
sage: L.seifert_circles()
[[13, 9], [12, 10, 4], [8, 14, 6, 2], [7, 3, 11, 5, 1]]
#remove the bad regions … this is just a method showing how the remove_regions works
#it returns a pd_code after the move has been made
sage: L.remove_regions()
[[1, 9, 2, 8],
[9, 3, 10, 2],
[5, 13, 6, 12],
[13, 7, 14, 6],
[18, 7, 1, 8],
[17, 11, 18, 10],
[14, 11, 15, 12],
[16, 4, 17, 3],
[15, 4, 16, 5]]

#the below method returns the seifert circles, regions and pd_code after the correction (we run the above method until we end up with these results)
[[[18, 4],
[21, 15],
[9, 3, 19, 11, 17, 5, 13, 7, 1],
[10, 20, 16, 12, 6, 14, 22, 8, 2]],
[[-15, -21],
[6, -13],
[2, -9],
[8, -1],
[18, 4],
[-3, 10, -19],
[12, -5, -17],
[20, 16, -11],
[14, 22, -7],
[19, 11, 17, -4],
[21, -14, -6, -12, -16],
[15, -20, -10, -2, -8, -22],
[5, 13, 7, 1, 9, 3, -18]],
[[1, 9, 2, 8],
[9, 3, 10, 2],
[5, 13, 6, 12],
[13, 7, 14, 6],
[22, 7, 1, 8],
[19, 11, 20, 10],
[16, 11, 17, 12],
[18, 4, 19, 3],
[17, 4, 18, 5],
[21, 14, 22, 15],
[20, 16, 21, 15]]]

#here we have numbered the seifert circles which form the strands in the braidword
sage: L.seifert_to_braid()
{1: [18, 4], 2: [9, 3, 19, 11, 17, 5, 13, 7, 1], 3: [10, 20, 16, 12, 6, 14, 22, 8, 2], 4: [21, 15]}

The ordering of the crossing remain …. which would complete the algorithm

Another example :

sage: L = link.Link(oriented_gauss_code = [[-1, +2, 3, -4, 5, -6, 7, 8, -2, -5, +6, +1, -8, -3, -4, -7],[‘-‘,’-‘,’-‘,’-‘,’+’,’+’,’-‘,’+’]])

sage: L.PD_code()
[[1, 13, 2, 12],
[9, 3, 10, 2],
[14, 4, 15, 3],
[4, 16, 5, 15],
[10, 5, 11, 6],
[6, 11, 7, 12],
[16, 8, 1, 7],
[13, 8, 14, 9]]
sage: L.regions()
[[6, -11],
[15, -4],
[9, 3, -14],
[2, -9, -13],
[1, 13, -8],
[12, -1, -7],
[5, 11, 7, -16],
[-3, 10, -5, -15],
[-6, -10, -2, -12],
[16, 8, 14, 4]]

sage: L.seifert_circles()
[[10, 6, 12, 2], [16, 8, 14, 4], [13, 9, 3, 15, 5, 11, 7, 1]]

sage: L.remove_regions()
‘No move required’
[[[10, 6, 12, 2], [16, 8, 14, 4], [13, 9, 3, 15, 5, 11, 7, 1]],
[[6, -11],
[15, -4],
[9, 3, -14],
[2, -9, -13],
[1, 13, -8],
[12, -1, -7],
[5, 11, 7, -16],
[-3, 10, -5, -15],
[-6, -10, -2, -12],
[16, 8, 14, 4]],

[[1, 13, 2, 12],

[9, 3, 10, 2],
[14, 4, 15, 3],
[4, 16, 5, 15],
[10, 5, 11, 6],
[6, 11, 7, 12],
[16, 8, 1, 7],
[13, 8, 14, 9]]]

sage: L.seifert_to_braid()
#we number the strands
{1: [10, 6, 12, 2], 2: [13, 9, 3, 15, 5, 11, 7, 1], 3: [16, 8, 14, 4]}

Other information for extracting the braidword
{1: [[1, 13, 2, 12], [9, 3, 10, 2], [10, 5, 11, 6], [6, 11, 7, 12]], 2: [[1, 13, 2, 12], [9, 3, 10, 2], [14, 4, 15, 3], [4, 16, 5, 15], [10, 5, 11, 6], [6, 11, 7, 12], [16, 8, 1, 7], [13, 8, 14, 9]], 3: [[14, 4, 15, 3], [4, 16, 5, 15], [16, 8, 1, 7], [13, 8, 14, 9]]}
{1: [‘-‘, ‘-‘, ‘+’, ‘+’], 2: [‘-‘, ‘-‘, ‘-‘, ‘-‘, ‘+’, ‘+’, ‘-‘, ‘+’], 3: [‘-‘, ‘-‘, ‘-‘, ‘+’]}
{1: [[13, 2], [3, 10], [11, 6], [7, 12]], 2: [[13, 2], [3, 10], [4, 15], [16, 5], [11, 6], [7, 12], [8, 1], [14, 9]], 3: [[4, 15], [16, 5], [8, 1], [14, 9]]}
{1: [[1, 12], [9, 2], [10, 5], [6, 11]], 2: [[1, 12], [9, 2], [14, 3], [4, 15], [10, 5], [6, 11], [16, 7], [13, 8]], 3: [[14, 3], [4, 15], [16, 7], [13, 8]]}
{1: [[9, 3, 10, 2], [10, 5, 11, 6], [6, 11, 7, 12]], 2: [[1, 13, 2, 12], [9, 3, 10, 2], [14, 4, 15, 3], [4, 16, 5, 15], [10, 5, 11, 6], [6, 11, 7, 12], [16, 8, 1, 7], [13, 8, 14, 9]], 3: [[14, 4, 15, 3], [4, 16, 5, 15], [16, 8, 1, 7], [13, 8, 14, 9]]}
[[13, 2], [3, 10], None]

So to conclude, I hope I can say we are almost able to see the result and almost there but as said the dots are yet to be connected. The implementation of the last part remains.

by esornep at June 30, 2014 02:01 PM

June 28, 2014

Nikhil Peter

GSoC – Post Mid-Term Update

Well, Mid-Terms have just finished(I passed!), and I’ve been neglecting this blog so here are some updates. The app is now in beta! More info can be found at the sage-devel post here Most of the work done is internal, but there are a few external fixes and enhancements as well. Agendas for the next […]

by hav3n at June 28, 2014 05:41 PM

June 27, 2014

Vince Knight

Pointing at a blog post about Sage and a treasure hunt

It looks like I might be moving... I've been playing with jekyll + github and really love the workflow so it looks like I might be moving this blog over there.

It's not official yet as I still need to think about a couple of things but in the meantime here's a post I just wrote over there about how the team I was in used Sage in a mathematical treasure hunt:

by Vincent Knight ( at June 27, 2014 12:10 PM

Vince Knight

Using Sage in a Treasure hunt

Here is my first post using jekyll, I’ll see how this goes and potentially migrate my blog completely to here.

This past year I’ve had a very minor role helping to organise the EURO Summer Institute for Healthcare. This took part from June 11 to 20 and it was really cool with 20 (ish) young academics (from first year PhD student to early years of Post Doc) coming together to present a paper they’re working on and get bespoke feedback from peers and an expert in the field.

The academic activities are really valuable but arguably of more value is the fact that everyone comes together and has a good time. One such ‘good time’ was a treasure hunt that because of some people dropping out of I was lucky enough to take part in. This was really great fun and in this post I’ll describe the problems as well as the Sage code my team used to almost win!

Here’s a photo that Paul Harper took of Pieter, Omar, Jenny and I huddled around Sage:

The treasure hunt involved us running around the small village to find an envelope with a particular mathematical problem. Once we’d solved that problem we would be given an indication as to where our next problem was.

Here are the various challenges (again: thanks to Paul for taking the photos!):

A clinical pathway problem (we did this one by hand):

A queueing problem:

To solve the above we wrote down the Markov chain (that’s actually the picture that Pieter is working on in the first photo of this blog post). Here’s the transition matrix corresponding to the Markov Chain:

\[\begin{pmatrix} -\frac{4_5}{45} & \frac{4}{45} & 0 & 0 & 0 \\
\frac{1}{20} & -\frac{5}{36} & \frac{1}{45} & \frac{1}{15} & 0 \\
0 & \frac{1}{10} & -\frac{1}{6} & 0 & \frac{1}{15} \\
0 & \frac{1}{20} & 0 & -\frac{13}{180} & \frac{1}{45} \\
0 & 0 & 0 & \frac{1}{10} & -\frac{1}{10} \end{pmatrix}\]

At the very beginning of the treasure hunt the organisers encouraged everyone to go and get their laptops. We originally thought that we’d manage without but at this point I ran to my room to go get my laptop :)

Here’s how we obtained the steady state probability \(\pi\):

Q = matrix([[-4/45,4/45,0,0,0,1],
pi = Q.solve_left(vector([0,0,0,0,0,1]))

The above solves the matrix equation \(\pi Q=0\) with an extra column in \(Q\) to ensure that the elements of \(\pi\) sum to 1.

Finally, the particular questions asked for the following weighted sum (corresponding the mean utilisation):


This gave a value of about \(0.499\).

I’m obviously missing out a couple of details (the actually description of the state space) but I’ll leave that as an exercise.

A chinese postman problem:

I’m not entirely sure what we did here as Pieter kind of ran with this one but at some point I was asked to calculate a matching on our graph. Here’s the Sage code:

sage: K = 500  # The inbuilt matching algorithm aim to maximise: we wanted to minimise so I threw a large constant in...
sage: G = Graph( { 1: {2: -23+K, 5:-20+K, 8:-19+K},
....:              2: {1: -23+K, 3:-8+K, 5:-18+K},
....:              3: {2: -8+K, 4:-5+K, 5:-16+K},
....:              4: {3: -5+K, 5:-14+K, 7:-7+K, 10:-21+K},
....:              5: {1: -20+K, 2:-18+K, 3:-16+K, 4:-14+K, 6:-1+K},
....:              6: {5: -1+K, 7:-8+K, 8:-20+K, 9:-12+K, 10:-17+K},
....:              7: {4: -7+K, 6:-8+K},
....:              8: {1: -19+K, 6:-20+K, 9:-15+K, 11:-20+K},
....:              9: {6:-12+K, 8:-15+K, 10:-14+K, 11:-12+K},
....:              10: {4:-21+K, 6:-17+K, 9:-14+K, 11:-18+K},
....:              11: {8:-20+K, 9:-12+K, 10:-18+K},
....:               }, sparse = True)
sage: G.matching()
[(1, 8, 481), (2, 3, 492), (4, 7, 493), (5, 6, 499), (9, 11, 488)]

I’m not entirely sure that’s the right Graph I was supposed to use but it’s what I have left over in my Sage notebook…

A packing problem:

I had remembered seeing that packing problems were implemented in Sage so as we were running back from collecting the clue I yelled: this will take 3 minutes!

Sadly, this wasn’t the case as the only implementation involves bins of the same size (which isn’t the case here). As I read around the docs the rest of my team was able to solve this heuristically which left us with the last challenge and at this point nicely in the lead!

The problem of doom that made everyone cry:

After having pulled in the data here is about as far as Sage got us:

p = list_plot(healthy, color='red', size=30)
p += list_plot(pathological, color='green',size=30)

This was a tricky problem. We had no wifi in our hotel so downloading a relevant R package was out of the question.

We eventually formulated a non linear integer program but sadly our solver didn’t seem able to handle it in time. With two minutes to go (the deadline was dinner) one of the teams who had been doing things very steadily ran over claiming that they might have a solution. Everyone went quiet and walked over as their solution was checked and verified. It was a really nice moment as everyone clapped and cheered. Here’s a picture of a bunch of us crowded around the solution understanding how they took a stab at fixing some of the variables so that the solver could get the solution.

This was a phenomenal piece of fun. You can find another description with a bunch of funny photos of us all struggling here.

If anyone has some suggestions to how we could have solved any problems any faster I’d be delighted to hear them :)

June 27, 2014 12:00 AM

June 26, 2014

Simon Spicer

Fun with elliptic curve L-function explicit formulae

Although I gave a brief exposition of the "baseline" explicit formula for the elliptic curve $L$-functions in a previous post, I wanted to spend some more time showing how one can use this formula to prove some useful statements about the elliptic curves and their $L$-functions. Some of these are already implemented in some way in the code I am building for my GSoC project, and I hope to include more as time goes on.

First up, let's fix some notation. For the duration of this post we fix an elliptic curve $E$ over the rationals with conductor $N$. Let $L_E(s)$ be its associated $L$-function, and let $\Lambda_E(s)$ be the completed $L$-function thereof, i.e.
$\Lambda_E(s) = N^{s/2}(2\pi)^s\Gamma(s)L_E(s)$, where $\Gamma$ is the usual Gamma function on $\mathbb{C}$. If you want to know more about how $E$, $L_E$ and $\Lambda_E$ are defined and how we compute with them, some of my previous posts (here and here) go into their exposition in more detail.

Let's recap briefly how we derived the baseline explicit formula for the $L$-function of $E$ (see this post for more background thereon). Taking logarithmic derivatives of the above formula for $\Lambda_E(s)$ and shifting to the left by one gives us the following equality:
$$\frac{\Lambda_E^{\prime}}{\Lambda_E}(1+s) = \log\left(\frac{\sqrt{N}}{2\pi}\right) + \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \frac{L_E^{\prime}}{L_E}(1+s).$$
Nothing magic here yet. However, $\frac{\Lambda_E^{\prime}}{\Lambda_E}$, $\frac{\Gamma^{\prime}}{\Gamma}$ and $\frac{L_E^{\prime}}{L_E}$ all have particularly nice series expansions about the central point. We have:

  • $\frac{\Lambda_E^{\prime}}{\Lambda_E}(1+s) = \sum_{\gamma} \frac{s}{s^2+\gamma^2}$, where $\gamma$ ranges over the imaginary parts of the zeros of $L_E$ on the critical line; this converges for any $s$ not equal to a zero of $L_E$.
  • $\frac{\Gamma^{\prime}}{\Gamma}(1+s) = -\eta + \sum_{k=1}^{\infty} \frac{s}{k(k+s)}$, where $\eta$ is the Euler-Mascheroni constant $=0.5772156649\ldots$ (this constant is usually denoted by the symbol $\gamma$ - but we'll be using that symbol for something else soon enough!); this sum converges for all $s$ not equal to a negative integer.
  • $\frac{L_E^{\prime}}{L_E}(1+s) = \sum_{n=1}^{\infty} c_n n^{-s}$; this only converges absolutely in the right half plane $\Re(s)>\frac{1}{2}$.

$$ c_n = \begin{cases}
\left[p^m+1-\#E(\mathbb{F}_{p^m})\right]\frac{\log p}{p^m}, & n = p^m \mbox{ is a perfect prime power,} \\
0 & \mbox{otherwise.}\end{cases} $$

Assembling these equalities gives us the aforementioned explicit formula:
$$ \sum_{\gamma} \frac{s}{s^2+\gamma^2} = \left[-\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)\right] + \sum_{k=1}^{\infty} \frac{s}{k(k+s)} + \sum_n c_n n^{-s}$$
which holds for any $s$ where all three series converge. It is this formula which we will use repeatedly to plumb the nature of $E$.

For ease of exposition we'll denote the term in the square brackets $C_0$. It pitches up a lot in the math below, and it's a pain to keep writing out!

Some things to note:

  • $C_0 = -\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)$ is easily computable (assuming you know $N$). Importantly, this constant depends only on the conductor of $E$; it contains no other arithmetic information about the curve, nor does it depend in any way on the complex input $s$.
  • The sum $\sum_{k=1}^{\infty} \frac{s}{k(k+s)}$ doesn't depend on the curve at all. As such, when it comes to computing values associated to this sum we can just hard-code the computation before we even get to working with the curve itself.
  • The coefficients $c_n$ can computed by counting the number of points on $E$ over finite fields up to some bound. This is quick to do for any particular prime power.

Good news: the code I've written can compute all the above values quickly and efficiently:

sage: E = EllipticCurve([-12,29])
sage: Z = LFunctionZeroSum(E)
sage: N = E.conductor()
sage: print(Z.C0(),RDF(-euler_gamma+log(sqrt(N)/(2*pi))))
(2.0131665172, 2.0131665172)
sage: print(Z.digamma(3.5),RDF(psi(3.5)))
(1.10315664065, 1.10315664065)
sage: Z.digamma(3.5,include_constant_term=False)
sage: timeit('')
625 loops, best of 3: 238 µs per loop

So by computing values on the right we can compute the sum on the left - without having to know the exact locations of the zeros $\gamma$, which in general is hard to compute.

Now that we have this formula in the bag, let's put it to work.


If we multiply the sum over the zeros by $s$ and letting $\Delta = 1/s$, we get
$$\sum_{\gamma} \frac{\Delta^{-2}}{\Delta^{-2}+\gamma^2} = \sum_{\gamma} \frac{1}{1+(\Delta\gamma)^2},$$
Note that for large values of $\Delta$, $\frac{1}{1+(\Delta\gamma)^2}$ is small but strictly positive for all nonzero $\gamma$, and 1 for the central zeros, which have $\gamma=0$. Thus the value of the sum when $\Delta$ is large gives a close upper bound on the analytic rank $r$ of $E$. That is, we can bound the rank of $E$ from above by choosing a suitably large value of $\Delta$ and computing the quantity on the right in the inequality below:
$$r < \sum_{\gamma} \frac{1}{1+(\Delta\gamma)^2} = \frac{1}{\Delta}\left[C_0 + \sum_{k=1}^{\infty} \frac{1}{k(1+\Delta k)} + \sum_n c_n n^{-1/\Delta}\right]. $$
Great! Right? Wrong. In practice this approach is not a very good one. The reason is the infinite sum over $n$ only converges absolutely for $\Delta<2$, and for Delta values as small as this, the obtained bound won't be very good. A value of $\Delta=2$, for example, gives us the zero sum $\sum_{\gamma} \frac{1}{1+(2\gamma)^2}$. If a curve's $L$-function has a zero with imaginary part about 0.5, for example - as many $L$-functions do - then such a zero will contribute 0.5 to the sum. And because zeros come in symmetric pairs, the sum value will then be at least a whole unit larger than the actual analytic rank. In general, for $\Delta<2$ the computed sum can be quite a bit bigger than the actual analytic rank of $E$.

Moreover, even though the Generalized Riemann Hypothesis predicts that the sum $\sum_n c_n n^{-1/\Delta}$ does indeed converge for any positive value of $\Delta$, in practice the convergence is so slow that we end up needing to compute inordinate amounts of the $c_n$ in order to hope to get a good approximation of the sum. So no dice; this approach is just too inefficient to be practical.

A graphic depiction of the convergence problems we run into when trying to evaluate the sum over the $c_n$. For the elliptic curve $E: y^2 + y = x^3 - 79*x + 342$ the above plot shows the cumulative sum $\sum_{n<T} c_n n^{-1/4}$ for $T$ up to 100000; this corresponds to $\Delta=4$. Note that even when we include this many terms in the sum, its value still varies considerably. It's unlikely the final computed value is correct to a single decimal digit yet.


We can get around this impasse by evaluating a modified sum over the zeros, one which requires us to only ever need to compute finitely many of the $c_n$. Here's how we do it. Start with the sum $\sum_{\gamma} \frac{s}{s^2+\gamma^2}$, and divide by $s^2$ to get $\frac{1}{s(s^2+\gamma^2)}$. Now hearken back to your college sophomore math classes and take the inverse Laplace transform of this sum. We get
$$ \mathcal{L}^{-1} \left[\sum_{\gamma} \frac{1}{s(s^2+\gamma^2)}\right](t) = \sum_{\gamma} \mathcal{L}^{-1} \left[\frac{1}{s(s^2+\gamma^2)}\right](t) = \frac{t^2}{2} \sum_{\gamma} \left(\frac{\sin(\frac{t}{2}\gamma)}{\frac{t}{2}\gamma}\right)^2 $$
Letting $\Delta = \frac{t}{2\pi}$ we get the sum
$$ \mathcal{L}^{-1} \left[\sum_{\gamma} \frac{s}{s^2+\gamma^2}\right](\Delta) = 2\pi^2\Delta^2 \sum_{\gamma} \text{sinc}^2(\Delta\gamma), $$
where $\text{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$, and $\text{sinc}(0) = 1$.

Note that as before, $\text{sinc}^2(\Delta\gamma)$ exaluates to 1 for the central zeros, and is small but positive for all nonzero $\gamma$ when $\Delta$ is large. So again, this sum will give an upper bound on the analytic rank of $E$, and that this bound converges to the true analytic rank as $\Delta \to \infty$.

If we do the same - divide by $s^2$ and take inverse Laplace transforms - to the quantities on the right, we get the following:

  • $\mathcal{L}^{-1} \left[\frac{C_0}{s^2} \right] = 2\pi\Delta C_0;$
  • $ \mathcal{L}^{-1} \left[\sum_{k=1}^{\infty} \frac{1}{sk(k+s)}\right] = \sum_{k=1}^{\infty} \frac{1}{k^2}\left(1-e^{-2\pi\Delta k}\right);$
  • $\mathcal{L}^{-1} \left[ \sum_{n=1}^{\infty} c_n \frac{n^{-s}}{s^2} \right] = \sum_{\log n<2\pi\Delta} c_n\cdot(2\pi\Delta - \log n). $
Note that the last sum is finite: for any given value of $\Delta$, we only need to compute the $c_n$ for $n$ up to $e^{2\pi\Delta}$. This is the great advantage here: we can compute the sum exactly without introducing any truncation error. The other two quantities are also readly computable to any precision.

Combining the above values and dividing by $2\pi^2\Delta^2$, we arrive at the rank bounding formula which is implemented in my GSoC code, hideous as it is:
r < \sum_{\gamma} \text{sinc}^2(\Delta\gamma) =  \frac{1}{\pi\Delta} &\left[ C_0 + \frac{1}{2\pi\Delta}\sum_{k=1}^{\infty} \frac{1}{k^2}\left(1-e^{-2\pi\Delta k}\right) \right. \\
&\left. + \sum_{n<e^{2\pi\Delta}} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right) \right] \end{align*}
Of course, the number of terms needed on the right hand side is still exponential in $\Delta$, so this limits the tightness of the sum we can compute on the left hand side. In practice a personal computer can compute the sum with $\Delta=2$ in a few seconds, and a more powerful cluster can handle $\Delta=3$ in a similar time. Beyond Delta values of about 4, however, the number of $c_n$ is just too great to make the sum effectively computable.


Explicit formula-type sums can also be used to answer questions about the distribution of zeros of $L_E$ along the critical line. Let $T$ be a positive real number, and $M_E(T)$ be the counting function that gives the number of zeros in the critical strip with imaginary part at most $T$ in magnitude. By convention, when $T$ coincides with a zero of $L_E$, then we count that zero with weight. In other words, we can write
$$ M_E(T) = \sum_{|\gamma|<T} 1 + \sum_{|\gamma|=T} 1/2.$$
There is a more mathematically elegant way to write this sum. Let $\theta(x)$ be the Heaviside function on $x$ given by
$$ \theta(x) = \begin{cases} 0, & x<0 \\
\frac{1}{2}, & x=0 \\
1, & x>0. \end{cases}$$
Then we have that
$$M_E(T) = \sum_{\gamma} \theta(T^2-\gamma^2). $$
We have written $M_E(T)$ as a sum over the zeros of $L_E$, so we expect that it comprises the left-hand-side of some explicit formula-type sum. This is indeed this the case. Using Fourier transforms we can show that
$$M_E(T) = \frac{2}{\pi}\left[C_0 T + \sum_{k=1}^{\infty} \left(\frac{T}{k} - \arctan \frac{T}{k}\right) + \sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n) \right] $$

With this formula in hand we can start asking questions on how the zeros of $L_E$ are distributed. For example, how does average zero density on the critical line scale as a function of $T$, the imaginary part of the area in the critical strip we're considering? How does zero density scale with the curve's conductor $N$?

If one assumes the Generalized Riemann Hypothesis, we can show that $\sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n) = O(\log T)$. Moreover, this sum is in a sense equidistributed about zero, so we expect its contribution for a sufficiently large randomly chosen value of $T$ to be zero. The other two quantities are more straightforward. The $C_0 T$ term is clearly linear in $T$. Going back to the definition of $C_0$, we see that $C_0 T = \frac{1}{2}T\log N + $constant$\cdot T$. Finally, we can show that the sum over $k$ equals $T\log T + O(\log T)$. Combining this gives us the estimate
$$ M_E(T) = \frac{1}{\pi} T (\log N + 2\log T+a) + O(\log T) $$
for some explicitly computable constant $a$ independent of $E$, $N$ or $T$ . That is, the number of zeros with imaginary part at most $T$ in magnitude is "very close" to $\frac{1}{\pi}T(\log N + 2\log T)$. Another way to put it is than the number of zeros in the interval $[T,T+1]$ is about $\frac{1}{2}\log N + \log T$.

The value of $M_E(T)$ for $T$ up to 30, for the elliptic curve $E: y^2 + y = x^3 - x$. The black line is just the 'smooth part' of $M_E(T)$, given by $\frac{2}{\pi}\left(C_0 T + \sum_{k=1}^{\infty} \left(\frac{T}{k} - \arctan \frac{T}{k}\right)\right)$. This gives us the 'expected number of zeros up to $T$', before we know any data about $E$ other than its conductor. The blue line is what we get when we add in $\sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n)$, which tells us the exact locations of the zeros: they will be found at the places where the blue curve jumps by 2.

Note that the sum over $n$ converges *very* slowly. To produce the above plot, I included terms up to $n=1 000 000$, and there is still considerable rounding visible in the blue curve. If I could some how evaluate the $c_n$ sum in all its infinite glory, then resulting plot would be perfectly sharp, comprised of flat lines that jump vertically by 2 at the loci of the zeros.

These are two of the uses of explicit formula-type sums in the context of elliptic curve $L$-functions. If you want to read more about them, feel free to pick up my PhD thesis - when it eventually gets published!

by Simon Spicer ( at June 26, 2014 04:23 PM

June 23, 2014

Amit Jamadagni

Screenshot from 2014-06-23 21:09:51

Hello everyone,
The past week has been a great learning curve for me. So let me take you back to where we stopped the last week. So given a oriented gauss code we were able to generate the planar diagram code. Now this week, we start with this planar diagram code and generate the Seifert circles, the regions of the knot, the bad regions and performing the Reidemeister move to correct the bad regions. So let me start by explaining what everything means.
Seifert circles : The regions obtained by smoothing of the crossing.


So in the above crossing t(i) (under cross, entering) goes to t(j) (over cross, leaving) and similarly we have t(j) (over cross, entering) goes to t(i+1) (under crossm leaving)
So we have the following example which shows such kind of a smoothing.
Regions of the knot: For the Seifert circles we moved in a specific direction at every crossing but here we only turn left at every crossing and if we move against the component we note it down with a negative sign.

Screenshot from 2014-06-23 20:45:08

So we have the regions and the seifert circles, now when do we call a region bad, a bad region is one that has two edges(two components) with the same sign, but that belong to different Seifert circles. So now we have the information of everything so we perform the corresponding Reidmeister move to correct the bad regions.

So if we start numbering the components, we eventually end up with the Planar Diagram code, from which we can make out the bad regions and perform the move.
Screenshot from 2014-06-23 21:09:51

So that was just a synopsis of the algorithm we implemented.

So first we generate the Seifert circles, from the PD code (which had all the information of the entering , leaving, + , – , under and over) this was straight forward as we had to match the formula which was mentioned above.

To compute the regions, we first had to turn left at every crossing. So we have arrays of data and we had to map one onto the other to get the regions.

Now we have the Seifert circles and bad regions so we start to look for the bad regions, this was also a bit straight forward as we just had to implement the formula for the bad regions. Now once- the bad regions were recognized we performed the move and renumbering the components was a simple logic of adding multiples of 2 to the respective edges. Then, once this was done we had to renumber the new crossings, the logic was to map the new data to the old data and to compute the numbering we had to use the logic that the greatest among the leaving component will have the lowest component entering. So that setup converted the bad to good regions. Now once this is done we need to compute the Seifert circles once again in order to get to the braid word which is the goal of the algorithm. So till now we have identified the Seifert circles, the regions, the bad regions, corrected them and computed the Seifert circles once again. So the next part of the algorithm is to read the braidword from the Seifert circles obtained.

Here is the gist of the implementation of PD Code, Seifert circles, regions and _is_move_required. A lot of refining has to be done, we just got the bare essentials working. The idea was to first get it going and then optimize as we move along. So that’s it from me, hope we conclude the implementation this week.

Here is the entire file we are working on :

And here is the pull request for the week :

The images have been taken from this document and this document has also been followed to a good extent to get a better understanding :

PS : There has been some edits to the code, will post the remaining part as well as the edited version sooner.

by esornep at June 23, 2014 03:55 PM

June 19, 2014

Simon Spicer

The structure of my GSoC code and a short demo

I'm at the point where I can explain the layout of the code that I've written so far, and even demonstrate some of its functionality.

As mentioned in previous posts, the initial goal of my Google Summer of Code project is to implement in Sage functionality that will allow us to bound the analytic rank of an elliptic curve $E$ via certain explicit formula-type sums relating to the $L$-function of $E$. One could certainly achieve this by simply writing a method on the existing elliptic curve class in Sage. However, the machinery we'll be using to bound rank is useful outside of the context of elliptic curves: it allows us to compute  sums over zeros of any $L$-function that comes from a modular form. In fact, so long as

  1. The $L$-function has an Euler product expansion over the primes;
  2. The Euler product coefficients - akin to the $a_p$ values we've mentioned previously - are readily computible; and
  3. The $L$-function obeys a functional equation which allows us to define and compute with the function inside of the critical strip;

then we can use this explicit-formula methodology to compute sums over the zeros of the $L$-function.

All the code is hosted on GitHub, so you can view it freely here (note: this repository is a branch of the main Sage repo, so it contains an entire copy of the Sage source. I'm only adding a tiny fraction of code in a few files to this codebase; I'll link to these files directly below).

To this effect, it makes sense to write a new class in Sage that can ultimately be made to compute zero sums for any motivic $L$-function. This is what I'll be doing. I've created a file in the sage/lfunctions/ directory, inside of which I'll write an family of classes that take as input a motivic object, and contain methods to compute various sums and values related to the $L$-function of that object.

I'll start by writing two classes: LFunctionZeroSum_abstract, which will contain all the general methods that can apply to a zero sum estimator attached to any motivic $L$-function. The second class, LFunctionZeroSum_EllipticCurve, inherits from the abstract class, and contains the code specific to elliptic curve $L$-functions.

I have also added a method to the EllipticCurve_rational_field class in the sage/schemes/elliptic_curves/ file (the class modeling elliptic curves over the rationals) to access the analytic rank estimation functionality of the zero sum estimator.

Let's see the code in action. To download a copy yourself, say from within a project in, open up a terminal and type

~$ git clone git://

This will download the code in the repo into a new folder GSoC_2014 in your current directory. CD into that directory, type make and hit enter to build. This will unfortunately take a couple hours to complete, but the good new is you'll only need to do that once. Alternatively, if you have an existing freshly-built copy of the sage source and good github-fu, you should be able to download just the relevant code and apply it, greatly speeding up the build time.

Once your copy of sage containing the new LFunctionZeroSum code is built, fire up that copy of sage (e.g. type ./sage while in the GSoC_2014 directory; don't just type sage, as this runs the system-wide copy of sage - not the one we want). This command-line interface will have all the functionality of any other current copy of Sage, but with the extra methods and classes I've written.

Let's start by estimating the rank of some elliptic curves:

sage: E = EllipticCurve([0,1,1,-2,0]); E
Elliptic Curve defined by y^2 + y = x^3 + x^2 - 2*x
over Rational Field
sage: E.rank(), E.conductor()
(2, 389)
sage: E.analytic_rank_bound()

In the code above, I've defined an elliptic curve given by the Weierstrass equation $y^2 +y = x^3+x^2-2x$; for those of you in the know, this is the curve with Cremona label '389a', the rank 2 curve with smallest conductor. I've then printed out the rank of the curve, which is indeed 2. Then I've called my analytic_rank_bound() method on $E$, and it returns a value which is strictly greater than the analytic rank of $E$. Since the analytic rank of an elliptic curve is always a non-negative integer, we can conclude that the analytic rank of $E$ is at most 2.

How long did this rank estimation computation take? Thankfully Sage has a timing function to answer this question:

sage: timeit('E.analytic_rank_bound()')
125 loops, best of 3: 3.33 ms per loop

Neat; that was quick. However, this elliptic curve has a small conductor - 389, hence it's Cremona label - so computing with its $L$-function is easy. The great thing with the zero sum code is that you can run it on curves with really large conductors. Here we reproduce the commands above on the curve $y^2 + y = x^3 - 23737x + 960366$, which has rank 8 and conductor $N \sim 4.6\times 10^{14}$:

sage: E = EllipticCurve([0,0,1,-23737,960366]); E
Elliptic Curve defined by y^2 + y = x^3 - 23737*x + 960366 
over Rational Field
sage: E.rank(), E.conductor()
(8, 457532830151317)
sage: E.analytic_rank_bound()
sage: timeit('E.analytic_rank_bound()')
125 loops, best of 3: 3.37 ms per loop

So we see then that this zero sum code doesn't care very much about the size of the conductor of $E$; this is one of its huge plusses. There are of downsides, of course - but we'll save those for another post.

Now let's look at the LFunctionZeroSum code.

sage: E = EllipticCurve([23,-100])
sage: Z = LFunctionZeroSum(E); Z
Zero sum estimator for L-function attached to 
Elliptic Curve defined by y^2 = x^3 + 23*x - 100 
over Rational Field

Right now we don't have too much beyond that which you can access from the EllipticCurve class. For one, we can compute the coefficients of the logarithmic derivative, since they are needed to compute the rank-estimating zero sum. We can also return and/or compute one or two other values associated to the zero sum class. As time goes on we'll flesh out more functionality for this class.

sage: for n in range(1,11):
....:     print(n,Z.logarithmic_derivative_coefficient(n))
(1, 0)
(2, 0.0)
(3, -1.09861228867)
(4, 0.0)
(5, 1.28755032995)
(6, 0)
(7, -0.277987164151)
(8, 0.0)
(9, -0.366204096223)
(10, 0)
sage: Z.elliptic_curve()
Elliptic Curve defined by y^2 = x^3 + 23*x - 100
over Rational Field
sage: Z.elliptic_curve().rank()
sage: Z.rankbound(Delta=1.7,function="gaussian")

All of this code is of course written to Sage spec. That means that every method written has documentation, including expected input, output, examples, and notes/warnings/references where necessary. You can access the docstring for a method from the command line by typing that method followed by a question mark and hitting enter.

All my code will initially be written in Python. Our first goal is to replicate the functionality that has already been written by Jonathan Bober in c. As such we're not aiming for speed here; instead the focus is to make sure all the code is mathematically correct.

Once that's up and running there are two natural directions to take the project:

  1. Speed the code up. Rewrite key parts in Cython, so that computations run faster, and thus analytic rank can be estimated for curves of larger conductor.
  2. Generalize the code. Write classes that model zero sum estimators for $L$-functions attached to eigenforms (those modular forms with Euler product expansions) of any level and weight, and allow for more types of sums to be computed. Ultimately write code that can be used to compute sums for any motivic $L$-function.

I'm going to attack speeding up the code first, since the elliptic curve rank estimation functionality will probably see the most use. If there's time at the end of the project I can look to generalizing the code as much as I can.

by Simon Spicer ( at June 19, 2014 02:48 PM

The explicit formula and bounding analytic rank from above

Wednesday's post gave some detail on how one can numerically evaluate $L_E(s)$, the $L$-function attached to $E$, at any given complex input $s$. This is certainly the first step in determining the analytic rank of $E$ i.e. the order of vanishing of the $L_E(s)$ at the central point $s=1$.

However, we immediately run into two major issues. The first, as mentioned previously, is that we have no proven lower bound on the size of the coefficients of the Taylor expansion of $L_E(s)$ at $s=1$, so we can never truly ascertain if the $n$th derivative there vanishes, or is just undetectibly small.

Today's post focuses on a method that at least partially overcomes the second issue: that evaluating $L_E(s)$ and its derivatives takes at least $O(\sqrt{N})$ time, where $N$ is the conductor of $E$. This means that working with elliptic curve $L$-functions directly becomes computationally infeasible if the curve's conductor is too big.

To see how we can bypass this problem, we'll have to talk about logarithmic derivatives. The logarithmic derivative of a function  $f(s)$ is just
$$ \frac{d}{ds} \log(f(s)) = \frac{f^{\prime}(s)}{f(s)}. $$
Logarithmic derivatives have some nice properties. For one, they turn products into sums: if $f(s) = g(s)h(s)$, then $\frac{f^{\prime}}{f} = \frac{g^{\prime}}{g} + \frac{h^{\prime}}{h}$. Because we can write $L_E(s)$ as an Euler product indexed by the primes, we therefore have that
$$ \frac{L_E^{\prime}}{L_E}(s) = \sum_p \frac{d}{ds} \log\left(\left(1-a_p p^{-s} + \epsilon_p p^{-2s}\right)^{-1}\right).  $$
When you multiply this out, you rather amazingly get a Dirichlet series whose coefficients can be described in a very mathematically elegant way:
$$ \frac{L_E^{\prime}}{L_E}(1+s) = \sum_{n=1}^{\infty} c_n n^{-s}, $$
$$ c_n = \begin{cases}
\left(p+1-\#E(\mathbb{F}_{p^m})\right)\frac{\log p}{p^m}, & n = p^m \mbox{a perfect prime power,} \\
0 & \mbox{otherwise.}\end{cases} $$

That is, the $n$th coefficient of $\frac{L_E^{\prime}}{L_E}(1+s)$ is $0$ when $n$ is not a prime power, and when it is, the coefficient relates directly to the number of points on the reduced curve over the finite field of $n$ elements. [Note we're shifting the logarithmic derivative over to the left by one unit, as this puts the central point at the origin i.e. at $s=0$.]

How does this yield a way to estimate analytic rank? Via something called the Explicit Formula for elliptic curve $L$-functions - which, confusingly, is really more of a suite of related methods than any one specific formula.

Remember the Generalized Riemann Hypothesis? This asserts, in the case of elliptic curve $L$-functions at least, that all nontrivial zeros of $L_E(s)$ lie on the line $\Re(s) = 1$, and two zeros can never lie on top of each other except at the central point $s=1$. The number of such zeros at the central point is precisely the analytic rank of $E$. Each nontrivial zero can therefore be written as $1\pm i\gamma$ for some non-negative real value of $\gamma$ - it turns out that whenever $L_E(1+i\gamma)=0$, then $L_E(1-i\gamma)=0$ too, so noncentral zeros always come in pairs. We believe GRH to be true with a high degree of confidence, since in the 150 years that $L$-functions have been studied, no-one has ever found a non-trivial zero off the critical line.

We can invoke GRH with logarithmic derivatives and something called the Hadamard product representation of an entire function to get another formula for the logarithmic derivative of $L_E(s)$:
$$ \frac{L_E^{\prime}}{L_E}(1+s) = -\log\left(\frac{\sqrt{N}}{2\pi}\right) - \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \sum_{\gamma} \frac{s}{s^2+\gamma^2}. $$
Here $\frac{\Gamma^{\prime}}{\Gamma}(s)$ is the logarithmic derivative of the Gamma function (known as the digamma function; this is a standard function in analysis that we know a lot about and can easily compute), and $\gamma$ runs over the imaginary parts of the zeros of $L_E(s)$ on the critical strip $\Re(s)=1$.

What does this get us? A way to equate a certain sum over the zeros of $L_E(s)$ to other more tractable functions and sums. When you combine the two formulae for $\frac{L_E^{\prime}}{L_E}(1+s)$ you get the following equality:

$$ \sum_{\gamma} \frac{s}{s^2+\gamma^2} = \log\left(\frac{\sqrt{N}}{2\pi}\right) + \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \sum_n \frac{c_n}{n} n^{-s}.$$
The key realization here is that while computing the zeros of $L_E$ is in general nontrivial and time-consuming, all the other quantities in the equation above are readily computable. We can therefore evaluate on the left by computing the quantities on the right to sufficient precision, none of which require knowledge of the exact locations of the zeros of $L_E(s)$.

The above is the first and perhaps most basic example of an explicit formula for elliptic curve $L$-functions, but it is not the one we want. Specificially, trouble still arises in the fact that when were still dealing with an infinite sum on the right, and when we look close to the central point $s=0$, convergence is unworkably slow. We can, however, work with related explicit formulae that reduce the sum on the right to a finite one.

The formula we'll use in our code can be obtained from the above equation by dividing both sides by $s^2$ and taking inverse Laplace transforms; we can also formulate it in terms of Fourier transforms. Specifically, we get the following. Recall the sinc function $\mbox{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$ with $\mbox{sinc}(0)=1$. Then we have:
$$ \sum_{\gamma} \mbox{sinc}(\Delta\gamma)^2 = Q(\Delta,N) + \sum_{\log n<2\pi\Delta} c_n\cdot\left(2\pi\Delta-\log n\right), $$
where $Q(\Delta,N)$ is a certain quantity depending only on $\Delta$ and $N$ that is easy to compute.

Let's look at the left hand side. Since $\mbox{sinc}(\Delta\gamma)^2$ is non-negative for any $\gamma$, and $\mbox{sinc}(0)^2 = 1$, when we sum over all zeros $\gamma$'s we get a value which must be strictly greater than the number of zeros at the central point (i.e. the analytic rank of $E$). Moreover, as we increase $\Delta$, the contribution to the sum from all the noncentral zeros goes to zero, as $\mbox{sinc}(x) \to 0$ as $x \to \infty$.

A graphic representation of the sum on the left side of the equation for the $L$-function attached to the elliptic curve $y^2 = x^3 + 103x - 51$ for three increasing values of the parameter $\Delta$. Vertical lines have been plotted at $x=\gamma$ whenever $L_E(1+i\gamma) = 0$, and the height of each line is given by the black curve $\mbox{sinc}(\Delta x)^2$. Thus summing up the length of the vertical lines gives you the value of the sum on the left side of the equation. We see that as $\Delta$ increases, the contribution from the blue lines - corresponding to noncentral zeros - goes to zero, while the contribution from the central zeros in red remain at 1 apiece. Since there are two central zeros (plotted on top of each other here), the sum will limit to 2 from above as $\Delta$ increases.

Now consider the right hand side of the sum. Note that the sum over $n$ is finite, so we can compute it to any given precision without having to worry about truncation error. Since $Q(\Delta,N)$ is easy to compute, this allows us to evaluate the entire right hand side with (relatively) little fuss. We therefore are able, via this equality, to compute a quantity that converges to $r$ from above as $\Delta \to \infty$.

A graphical representation of the finite sum $\sum_{\log n<2\pi\Delta} c_n\cdot\left(2\pi\Delta-\log n\right)$ for the same curve, with $\Delta=1$. The black lines are the triangular function $y = \pm (2\pi\Delta-x)$, whose value at $\log n$ weight the $c_n$ coefficient. Each blue vertical line is placed at $x=\log n$, and has height $c_n\cdot\left(2\pi\Delta-\log n\right)$. Summing up the signed lengths of the blue lines gives the value of the sum over $n$. Note that there are only 120 non-zero terms in this sum when $\Delta=1$, so it is quick to compute.

This formula, and others akin to it, will form the core functionality of the code I will be writing and adapting to include in Sage, as they give us a way to quickly (at least when $\Delta$ is small) show that the analytic rank of a curve can't be too big.

There are some downsides, of course, most notable of which is that the number of terms in the finite sum on the right is exponential in $\Delta$. Tests show that on a laptop one can evaluate the sum in a few milliseconds for $\Delta=1$, a few seconds for $\Delta = 2$, but the computation takes on the order of days when $\Delta=4$. Nevertheless, for the majority of the elliptic curves we'll be looking at, $\Delta$ values of 2 or less will yield rank bounds that are in fact tight, so this method still has much utility.

Another issue that warrants investigation is to analyze just how fast - or slow - the convergence to $r$ will be as a function of increasing $\Delta$; this will allow us to determine what size $\Delta$ we need to pick a priori to get a good estimate out. But this is a topic for a future date; the next post or two will be dedicated toward detailing how I'm implementing this rank estimation functionality in Sage.

by Simon Spicer ( at June 19, 2014 11:36 AM

June 16, 2014

Amit Jamadagni


Hello everyone,
This week we focused on the implementation details of the Vogel’s algorithm. I will get back to this algorithm at a later stage. To start off, last week we had a problem in generating the planar diagram code. But that was resolved using the braid as input. It seems braid word has all the information to generate the planar diagram (unique). The problem with other inputs like the gauss code or the dt code is we get a planar diagram but which component of the over-crossing comes first when we move in the clockwise direction is what is the problem. But in braidword we have this information in the form of -1 and +1, which happens to provide this information. This was achieved at the very beginning of the last week. Ever since it has been the implementation of Vogel’s algorithm, which takes in the oriented gauss code and converts it to the braid word. We have had to work on the oriented gauss code as the parameter to the method, it is not in a great shape but still the main idea has been achieved. (The minor problem is we have to pass the oriented gauss code as a parameter for the method). The difference here is, for every crossing we take how the crossing is oriented, so the name oriented gauss code. So this allows us to convert from oriented gauss code to planar diagram. The next thing was to determine the regions and then the Seifert circles. There are two parts to the algorithm of which one is to identify the unoriented Seifert circles and then perform a Reidmester move. Once this move is done then look for other unoriented Seifert circles. Once we end up with no unoriented circles we get the braidword which forms the second part. So in the first part we are able to convert the oriented gauss code to Planar Diagram, we are working on looking into how we can get regions which are bounded by the components. Next step is to identify the Seifert circles and then detect the unoriented pair. We have used different approaches for detecting the regions, like the Directed graphs where the crossings form the vertices and they have an edge if they have a component or two in common. But this made things a bit complicated as directed graphs cycles (g._show_all_cycles() ) returned more data and it was a bit difficult to get the exact regions. We are working on this as of now. Miguel has sent in his ideas on this matter and I am looking into it. The idea now is to move around the crossings in a given fashion and then work out the regions and then move on. I had another approach which I took time to read, this was the approach of Andrew who has implemented a version of Vogel’s algorithm in the Braid programme. It is somewhat on the similar lines but we choose to re think as it had some huge information to calculate after each step. I have sent in a pull request with the work

Documentation work still remains of the newly implemented methods and we just got these methods working we still need to re think on the design, two to three things that we are lacking as of now:
1. Documentation
2. Redesigning the currently implemented methods.
3. Setting the minor issues which have been already identified from the previous pull requests. (in-line comments on the previous pull requests have to be worked upon).


by esornep at June 16, 2014 09:20 AM

June 12, 2014

William Stein

SageMathCloud Task Lists

I've added task list functionality to SageMathCloud (SMC), so you can keep track of a list of things to do related to a project, paper, etc. Task lists are saved as files on the filesystem, so they can be backed up in the usual way, automatically generated, etc. I doubt anybody is going to use SMC just for the tasks lists, but for people already using SMC in order to use Sage, write LaTeX documents, use IPython notebooks, etc., having a convenient integrated task list should come in handy.
To create a task list, in a project, click "+New", name the task list, then click the "Task List" button.  Here's what a task list looks like:

The Design

I used the task list quite a bit when implementing the task list, and significantly modified the interface many, many times. I also tried out numerous other todo list programs for inspiration. I eventually settled on the following key design choices, some of which are different than anything I've seen. In particular, the design targets highly technical users, which is not something I saw with any other todo list programs I tried.
  • Markdown editor: The task description is formatted using client-side rendered Github flavored markdown (using marked), including [ ] for checkboxes. I also include full MathJax support, and spent quite a while working around various subtleties of mixing mathjax and markdown. I'll be rewriting Sage worksheets to use this code. The editor itself is Codemirror 4 in markdown mode, so it respects whatever theme you choose, has nice features like multiple cursors, paren matching, vim/emacs modes, etc. Multiple people can edit the same task at once and everybody will see the changes as they are made (note: I haven't implemented showing other cursors.)
  • Relative dates and times: All dates and times are shown relative to right now. If something is due in 20 hours, it says "due about 20 hours from now". I also included a sortable column with the last time when a task was edited, also shown relative to the current time. This display uses the timeago jquery plugin. You can of course click on the due date to see the actual date.
  • Hashtags: As I was implementing (and removing) features such as priorities, a way to indicate which task you are currently working on, folders, etc, I decided that hashtags can provide every feature. Moreover, they are very text editor/programmer friendly. When editing a task, if you put #foo, then it is interpreted as a hashtag, which you can click on to show only tasks containing that tag. To disambiguate with markdown headings, to make a heading you have to include a whitespace, so # foo. I haven't added autocomplete for hashtags yet, but will. You can easily click on hashtags anywhere to select them, or use the bar at the top.
  • User-specified task order: The default sort order for tasks is custom. There is a drag handle so you can explicitly move tasks up and down in order to indicate how important they are to you (or whatever else). You can also click an up hand or down hand to knock the currently selected task to the bottom of the list of displayed tasks.
Of course, I still have an enormous list of functionality I would like to add, but that will have to wait. For example, I need to enable a chat associated to each task list, like the chats associated to Sage worksheets and other files. I also want to make it so one can select a range of tasks and copy them, move them, paste them into another list, etc. It would also be nice to be able to export task lists to HTML, PDF, etc., which should be fairly easy using pandoc.  I'm also considering making a note list, which is like a task list but without the due date or "done" box.  Because of all the infrastructure already there, it would be easy to add code evaluation functionality, thus creating something like worksheets, but from a different point of view (with maybe hashtags determining the Python process).

Databases and Differential Synchronization

One interesting thing I noticed when implementing task lists is that there are many similarities with the original design (and also IPython notebook), in which a worksheet is a list of cells that get evaluated, can be manually reordered, etc. Similarly, a task list is a list of "cells" that you edit, manually reorder, etc. With sagenb we had longstanding issues involving the order of each cell and assigning an integer numerical id (0, 1, 2, ...) to the cells, which resulted in things like cells seemingly getting randomly reordered, etc. Also, having multiple simultaneous viewers with automatic synchronization is very difficult with that model. For task lists, I've introduced some new models to address these issues.

A natural way to store a task list is in a database, and I initially spent some time coming up with a good database schema and implementing basic lists using Cassandra for the backend. However, I couldn't pursue this approach further, since I was concerned about how to implement realtime synchronization, and also about the lack of easily backing up complete task lists via snapshots, in git, etc. So instead I created an "object database" API built on top of a file that is synchronized across clients (and the server) using differential synchronization. The underlying file format for the database is straightforward -- there is one line in JSON format for each record in the database. When objects are changed, the file is suitably modified, synchronized to other clients, and events are triggered.

Since differential synchronization has no trouble with files that have "a few thousand lines", this approach works well for our purposes (since personal or shared todo lists are typically fairly short). Also, having one line per record is at least more git friendly than something like a sqlite database. I'm considering rewriting my implementation of IPython notebook sync on top of this abstraction.
Since I view the task list as a database, each task has a globally unique uuid. Also, instead of viewing the task order as being defined by an integer 0,1,2,3, which leads to all manner of bugs and programming misery, race conditions, etc., instead we view the order as being determined by floating point positions. So to insert a task between tasks with positions 1 and 2, we just give the task position 1.5.

by William Stein ( at June 12, 2014 01:31 PM

June 11, 2014

Simon Spicer

How to compute with elliptic curve L-functions and estimate analytic rank

In a previous post I defined what an elliptic curve $L$-function $L_E(s)$ is, and indicated why the behaviour of the function at the central point $s=1$ is of such great interest: the order of vanishing of $L_E(s)$ at $s=1$ is conjecturally precisely equal to the algebraic rank of $E$.

Because this equality is still conjectural, we will cal the former quantity -- then number of derivitaves of $L_E(s)$ that vanish at $s=1$ - the analytic rank of $E$. The topic of this post is to address the question: given an elliptic curve $E$, how do we go about computing its analytic rank?

Before we can hope to answer this question, we need to know how to evaluate $L_E(s)$ itself for any given $s$. In the previous post I gave both the Euler product and Dirichlet series definitions for $L_E(s)$; to jog your memory, here's the Euler product of $L_E(s)$:
$$ L_E(s) = \prod_p \left(1-a_p p^{-s} + \epsilon_p p^{-2s}\right)^{-1}, $$
where the product runs over all prime numbers, $a_p = p+1-\#\{E(\mathbb{F}_p)\}$, and $\epsilon_p = 0$ if $p$ divides the conductor of $E$ and $1$ otherwise. The Dirichlet series is $L_E(s) = \sum_{n}a_n n^{-s}$, which is precisely what you get when you multiply out the Euler product. Note that we are suppresing the dependence on $E$ in both the $a_n$ and $\epsilon_p$ constants.

However, both the Euler product and Dirichlet series representations of $L_E(s)$ will only converge absolutely when the real part of $s$ exceeds $\frac{3}{2}$. Although the Sato-Tate Conjecture (now a theorem, but the name has stuck) implies that the expansions will in fact converge conditionally for $\Re{s}>\frac{1}{2}$, the convergence is so slow that attempting to evaluate $L_E(s)$ near the central point by multiplying or summing enough terms is horribly inefficient. As such, we need a better way to evaluate $L_E(s)$ - and thankfully, such better ways do indeed exit.

Remember how I mentioned in the previous post that $L$-functions obey a very nice symmetry condition? Well, here's that condition exactly: first, we need to define something called the completed $L$-function $\Lambda_E(s)$. This is just $L_E(s)$ multiplied by some extra factors. Specifically,
$$\Lambda_E(s) = N^{\frac{s}{2}}(2\pi)^{-s}\Gamma(s) L_E(s), $$
where $N$ is the conductor of $E$ and $\Gamma(s)$ is the usual Gamma function on $\mathbb{C}$ (the one that gives you $(n-1)!$ when you evaluate it at the positive integer $s=n$).

We can show that $\Lambda_E(s)$ is entire on $\mathbb{C}$; that is, it doesn't have any poles. Moreover, $\Lambda_E(s)$ obeys the glorious symmetry property
$$ \Lambda_E(s) = w_E \Lambda_E(2-s), $$
where $w_E$ is either $1$ or $-1$ and depends on $E$. This is called the functional equation for the $L$-function attached to $E$.

Another way to put it is that shifted completed $L$-function $\Lambda_E(1+s)$ is either an even or an odd function of $s$. Because the factors $N^{\frac{s}{2}}$, $(2\pi)^{-s}$ and $\Gamma(s)$ are all readily computable, this allows us to determine the value of $L_E(s)$ when the real part of $s$ is less than $\frac{1}{2}$.

What's left, then, is to figure out how to efficinetly evaluate $L_E(s)$ in the strip $\frac{1}{2} \le \Re(s) \le \frac{3}{2}$. This is called the critical strip for the $L$-function, and it is here that the behaviour of the function is most interesting.

[Aside: we can, for example, show that $\Lambda_E(1+s)$ is never zero outside of the critical strip. The Generalized Riemann Hypothesis in fact asserts that elliptic curve $L$-functions are only ever zero along the exact center of this strip, the critical line $\Re(s)=1$. We'll get back to the zeros of elliptic curve $L$-functions in a later post.]

To evaluate $\Lambda_E(s)$ in the critical strip, we make use of the modularity of $E$. The modularity theorem states that elliptic curves are modular: for every elliptic curve $E$ over the rationals there exists a weight 2 cuspidal eigenform $f_E$ of level $N$ (where $N$ is precisely the conductor of $E$), such that the $a_n$ of $E$ as defined previously equal the Fourier coefficients of $f_E$. If you haven't studied modular forms before, the crucial piece of knowledge is that there is a natural procedure for constructing $L$-functions from modular forms in such a way that the definition makes sense for all complex inputs $s$, and that the $L$-function attached to the cusp form $f_E$ will exactly equal the elliptic curve $L$-function $L_E(s)$. This is in fact how we show that elliptic curve $L$-functions can be analytically continued to all of $\mathbb{C}$.

The take-away is that via modularity there is an infinite sum representation for $L_E(s)$ which converges absolutely for all $s$. Here it is: define the auxiliary function
$$\lambda_E(s) = \left(\frac{\sqrt{N}}{2\pi}\right)^{s} \sum_{n=1}^\infty a_n n^{-s}\Gamma \left(s,\frac{2\pi}{\sqrt{N}}\cdot n\right), $$
where all the quantities are as defined previously, and $\Gamma(s,x)$ is the upper incomplete Gamma function (note that since $\Gamma(s,x)$ goes to zero exponentially as $x$ goes to infinity, this sum will converge absolutely for any $s$, with rate of convergence scaling with $\sqrt{N}$). Then we have
$$ \Lambda_E(s) = \lambda_E(s) + w_E \lambda_E(2-s). $$
Because we know how to compute incomplete Gamma functions, this gives us a way to evaluate $\Lambda_E(s)$, and thus $L_E(s)$, in the critical strip. The upside with this formula and variations thereof is that it works for any value of $s$ you stick in - including values near $s=1$. Similar formulae exist for the derivatives of $L_E(s)$, so we can in theory compute $L_E^{(n)}(1)$, the $n$th derivative of $L_E(s)$ at the central point, to any degree of accuracy for any $n$.

Thus if we want to compute a curve's analytic rank, what's stopping us from just evaluating successive derivatives of $L_E(s)$ at $s=1$ until we hit one which is not zero?

Two reasons. The first is that there's no way around the fact that you need about $\sqrt{N}$ terms to compute $L_E(s)$ or its derivatives to decent precision. If the conductor of the curve is too big, as is often the case, it takes an inordinate amount of time to simply evaluate the $L$-function near the central point. This makes direct evaluation impractical for all but the smallest-conductor curves -- and for those curves we can usually compute rank via other methods anyway.

The second reason is a more subtle one: how do you tell numerically if the $n$th derivative of $L_E(s)$ at $s=1$ is zero? If you think about it, it's easy to answer this question in one direction only: if you evaluate $L_E^{(n)}(1)$ to some precision and get digits that aren't all zero, then (assuming your code is correct) the $n$th derivative of $L_E(s)$ does not vanish. However, no matter how many digits of precision we compute getting all zeros, the possibility will always remain that the next digit along might not be zero.

In general, there is no numerical way to determine if the value of a black-box complex-valued function at a point is zero, or just really really close to zero. This is why, if you look in the literature, you'll find "methods to estimate analytic rank", but never to compute the quantity exactly. It's impossible to do without extra knowledge. Specifically, we'd need to have some sort of a computable lower bound on the size of the derivatives of $L_E(s)$ at the central. Unfortunately, no such theorems currently exist, so we're stuck with estimating analytic rank for now.

Thankfully, the $\sqrt{N}$-dependence issue is more hopeful. The next post will detail a method that provides good estimates for the analytic rank that scales much more slowly with the curve's conductor.

by Simon Spicer ( at June 11, 2014 04:57 PM

Mathematical background: elliptic curves and L-functions

Day 1 of my Google Summer of Code project! I will try post updates at least once a week; I thought a good starting point would be to give an introduction to the mathematical objects that this project revolves around. The next post will then give a more detailed description of the project itself, and the structure of the code that I'm going to adapt, write and include in Sage.

To that end, for this post I will assume some  knowledge of elementary number theory, algebra and  complex analysis, but nothing more complicated than that.

Let $E$ be an elliptic curve over the rational numbers. We can think of $E$ as the set of rational solutions $(x,y)$ to a two-variable cubic equation in the form:
$$ E: y^2 = x^3 + Ax + B $$
for some integers $A$ and $B$, along with an extra "point at infinity". An important criterion is that the $E$ be a smooth curve; this translates to the requirement that the discriminant of the curve, given by $-16(4A^3+27B^2)$, is not zero.

One of the natural questions to ask when considering an elliptic curve is "how many rational solutions are there?" It turns out elliptic curves fall in that sweet spot where the answer could be zero, finitely many or infinitely many - and figuring out which is the case is a deeply non-trivial problem.

The rational solutions form an abelian group with a well-defined group operation that can be easily computed. By a theorem of Mordell, the group of rational points on an elliptic curve $E(\mathbb{Q})$ is finitely generated; we can therefore write
$$ E(\mathbb{Q}) \approx T \times \mathbb{Z}^r,$$
where $T$ is a finite group (called the torsion subgroup of $E$), and $r$ is denoted the algebraic rank of $E$.

The elliptic curve group law explained visually: three points in a straight line add to zero; because the point at infinity is the identity element, this means that the sum R of two points P and Q is the reflection about the real axis of the other point on the curve on the straight line connecting P and Q.

Determining the torsion subgroup of $E$ is a relatively straightforward endeavor. By a theorem of Mazur, rational elliptic curves have torsion subgroups that are (non-canonically) isomorphic to one of precisely fifteen possibilities: $\mathbb{Z}/n\mathbb{Z}$ for $n = 1$ through $10$ or $12$; or $\mathbb{Z}/2\mathbb{Z}\oplus \mathbb{Z}/2n\mathbb{Z}$ for $n = 1$ though $4$. Computing the rank $r$ - the number of independent rational points on $E$ - is the hard part, and it is towards this end that this project hopes to contribute.

Perhaps surprisingly, we can translate the algebraic problem of finding rational solutions to $E$ to an analytic one - at least conjecturally. To understand this we'll need to know what an elliptic curve $L$-function is. These are holomorphic functions defined on the whole complex plane that somehow encode a great deal of information about the elliptic curve they're attached to.

The definition goes as follows: for each prime $p$, count the number of solutions to the elliptic curve equation modulo $p$; we'll call this number $N_p(E)$. Then define the number $a_p(E)$ by
$$ a_p(E) = p+1 - N_p(E). $$
Hasse's Theorem states that $a_p(E)$ is always less that $2\sqrt{p}$ in magnitude for any $p$, and the Sato-Tate conjecure (recently proven by Taylor et al) states that for a fixed elliptic curve, the $a_p$ (suitably transformed) are asymptotically distributed in a semi-circular distribution about zero.

Next, for a given $p$ define the local factor $L_p(s)$ to be the function of the complex variable $s$ as follows:
$$ L_p(s) = \left(1-a_p(E)p^{-s} + \epsilon(p)p^{-2s}\right)^{-1}, $$
where $\epsilon(p)$ is 0 if $p$ divides the conductor of $E$, and 1 otherwise.

The conductor of $E$ is a positive integer that encodes the bad reduction of $E$: for a finite list of primes $p$, when we reduce the curve modulo $p$, we don't get a smooth curve but rather a singular one instead. The conductor is just the product of these primes to the first or second power (or in the cases $p=2$ or $3$, up to the eighth and fifth powers respectively). If you're unfamiliar with elliptic curves, the thing to note is that the conductor always divides the discriminant of $E$, namely the quantity $-16(4A^3+27B^2)$ mentioned previously.

Finally we can define the $L$-function attached to $E$:
$$ L_E(s) = \prod_p L_p(s) = \prod_p \left(1-a_p(E)p^{-s} + \epsilon(p)p^{-2s}\right)^{-1}. $$
The above representation of $L_E(s)$ is called the Euler product form of the $L$-function. If we multiply out the terms and use power series inversion we can also write $L_E(s)$ as a Dirichlet series:
$$ L_E(s) = \sum_{n=1}^{\infty} a_n(E) n^{-s}, $$
where for non-prime $n$ the coefficients $a_n$ are defined to be exactly the integers you get when you multiply out the Euler expansion.

If you do some analysis, using Hasse's bound on the size of the $a_p(E)$ and their distribution according to Sato-Tate, one can show that the above to representations only converge absolutely when the real part of $s$ is greater than $\frac{3}{2}$, and conditionally for $\Re(s)>\frac{1}{2}$. However, the modularity theorem states that these elliptic curve $L$-functions can actually be analytically continued to the entire complex plane. That is, for every elliptic curve $L$-function $L_E(s)$ as defined above, there is an entire function on $\mathbb{C}$ which agrees with the Euler product/Dirichlet series definition for $\Re(s)>1$, but is also defined - and readily computable - for all other complex values of $s$. This entire function is what we actually call the $L$-function attached to $E$.

The way we analytically continue $L_E(s)$ yields that the function is highly symmetric about the line $\Re(s)=1$; moreover, because the function is defined by real coefficients $L_E(s)$ also obeys a reflection symmetry along the real axis. The point $s=1$ is in a very real sense therefore the central value for the $L$-function. It thus makes sense to investigate the behaviour of the function around this point.

Because $L_E(s)$ is entire, it has a Taylor expansion at any given point. We can ask what the Taylor expansion of $L_E(s)$ about the point $s=1$ is, for example. One of the central unproven conjectures in modern-day number theory is the Birch and Swinnerton-Dyer Conjecture: that the order of vanishing of the Taylor expansions of $L_E(s)$ about the point $s=1$ is precisely $r$, the algebraic rank of the elliptic curve $E$. That is, if we let $z=s-1$ so that
$$ L_E(s) = c_0 + c_1 z + c_2 z^2 + \ldots $$
is the expansion of $L_E(s)$ about the central point, the BSD conjecture holds that $c_0$ through $c_{r-1}$ are all zero, and $c_r$ is not zero.

The values of three elliptic curve L-functions along the critical line $1+it$ for $-10<=1<=10$. Blue corresponds to the curve $y^2 = x^3 - 13392x - 1080432$, a rank 0 curve, red is that of $y^2 = x^3 - 16x + 16$, a rank 1 curve, and green is $y^2 = x^3 - 3024x + 46224$, a rank 2 curve. Note that close to the origin the graphs look like non-zero constant function, a straight line through the origin and a parabola respectively.

Thus if we can compute the order of vanishing of the elliptic curve $L$-function at the central point, we can at least conjecturally compute the rank of the curve. This converts an algebraic problem into a numerical one, which is perhaps more tractible.

The techniques we'll use to attempt to do this will be the subject of the next blog post. Unfortunately there are still plenty of challenges to this approach - the least of which boils down to: how do you numerically determine if the value of a function is zero, or just really, really close to zero? The short answer is that without theorems to back you up, you can't -- but we can still make considerable progress toward the problem of computing an elliptic curve's rank.

by Simon Spicer ( at June 11, 2014 02:48 PM