July 21, 2014

Amit Jamadagni

Conversions, cleaning …

Hello everyone, The last week we focused on the conversions. Some parts are ready for the knots, in the sense that the standard input conversion is done, but there is lots more to add to it. We are returning the braid word … Continue reading

by esornep at July 21, 2014 09:41 PM

Simon Spicer

How to efficiently enumerate prime numbers

There's a part in my code that requires me to evaluate a certain sum
$$ \sum_{p\text{ prime }< \,M} h_E(p) $$
where $h_E$ is a function related to a specified elliptic curve that can be evaluated efficiently, and $M$ is a given bound that I know. That is, I need to evaluate the function $h_E$ at all the prime numbers less than $M$, and then add all those values up.

The question I hope to address in this post is: how can we do this efficiently as $M$ gets bigger and bigger? Specifically, what is the best way to compute a sum over all prime numbers up to a given bound when that bound can be very large?

[For those who have read my previous posts (you can skip this paragraph if you haven't - it's not the main thrust of this post), what I want to compute is, for an elliptic curve $E/\mathbb{Q}$, the analytic rank bounding sum $ \sum_{\gamma} \text{sinc}^2(\Delta \gamma) $ over the zeros of $L_E(s)$ for positive parameter $\Delta$; this requires us to evaluate the sum $ \sum_{n < \exp(2\pi\Delta)} c_n\cdot(2\pi\Delta-\log n)$. Here the $c_n$  are the logarithmic derivative coefficients of the completed $L$-function of $E$. Crucially $c_n = 0$ whenever $n$ isn't a prime power, and we can lump together all the terms coming from the same prime; we can therefore express the sum in the form you see in the first paragraph.]

As with so many things in mathematical programming, there is a simple but inefficient way to do this, and then there are more complicated and ugly ways that will be much faster. And as has been the case with other aspects of my code, I've initially gone with the first option to make sure that my code is mathematically correct, and then gone back later and reworked the relevant methods in an attempt to speed things up.


Here's a Python function that will evaluate the sum over primes. The function takes two inputs: a function $h_E$ and an integer $M$, and returns a value equal to the sum of $h_E(p)$ for all primes less than $M$. We're assuming here that the primality testing function is_prime() is predefined.
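A minimal sketch matching this description (a hypothetical reconstruction, not the original listing; the trial-division is_prime() below is just a stand-in for whatever predefined primality test is available):

```python
def is_prime(n):
    # stand-in trial-division primality test; assumed predefined in the post
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def prime_sum_v1(h_E, M):
    # naive version: test every integer below M for primality
    y = 0
    for n in range(2, M):
        if is_prime(n):
            y += h_E(n)
    return y
```

For example, prime_sum_v1(lambda p: 1, 100) simply counts the primes below 100.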

As you can see, we can achieve the desired outcome in a whopping six lines of code. Nothing mysterious going on here: we simply iterate over all integers less than our bound and test each one for primality; if that integer is prime, then we evaluate the function h_E at that integer and add the result to y. The variable y is then returned at the end.

Why is this a bad way to evaluate the sum? Because there are far more composite integers than there are primes. According to the prime number theorem, the proportion of integers up to $M$ that are prime is approximately $\frac{1}{\log M}$. For my code I want to compute with bounds in the order of $M = e^{8\pi} \sim 10^{11}$; the proportion of integers that are prime up to this bound value is correspondingly about $\frac{1}{8\pi} \sim 0.04$. That is, 96% of the integers we iterate over aren't prime, and we end up throwing those cycles away.

Just how inefficient this method is of course depends on how quickly we can evaluate the primality testing function is_prime(). The best known deterministic primality testing algorithm has running time that scales with (at most) the 6th power of $\log n$, where $n$ is the number being tested. This places primality testing in the class of polynomial-time algorithms, which means the runtime of the function scales relatively well with the size of the input. However, what kills us here is the sheer number of times we have to call is_prime() - on all integers up to our bound $M$ - so even if it ran in constant time, the prime_sum() function's running time would still scale with the magnitude of $M$.


We can speed things up considerably by noting that apart from 2, all prime numbers are odd. We are therefore wasting a huge amount of time running primality tests on integers that we know a priori are composite. Assuming is_prime() takes a similar time to execute as our coefficient function h_E(), we could therefore roughly halve the runtime of the prime sum function by skipping the even integers and just checking odd numbers for primality.

We can go further. Apart from 2 and 3, all primes yield a remainder of 1 or 5 when you divide them by 6 (because all primes except for 2 are 1 (modulo 2) and all primes except for 3 are 1 or 2 (modulo 3)). We can therefore skip all integers that are 0, 2, 3 or 4 modulo 6; this means we only have to check for primality on one third of all the integers less than $M$.

Here's a second version of the prime_sum() function that does this:
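A sketch of what such a version might look like (hypothetical: 2 and 3 are handled up front, after which alternating steps of 2 and 4 visit exactly the integers that are 1 or 5 modulo 6; is_prime() is again a trial-division stand-in):

```python
def is_prime(n):
    # stand-in trial-division primality test; assumed predefined in the post
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def prime_sum_v2(h_E, M):
    # handle 2 and 3 separately, then only test n = 1 or 5 (mod 6)
    y = sum(h_E(p) for p in (2, 3) if p < M)
    n, gap = 5, 2            # visits 5, 7, 11, 13, 17, 19, ...
    while n < M:
        if is_prime(n):
            y += h_E(n)
        n += gap
        gap = 6 - gap        # alternate step sizes 2 and 4
    return y
```

Note the extra bookkeeping relative to version 1: special cases for 2 and 3, plus the alternating step size.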

Of course we could go even further with the technique by looking at remainders modulo $p$ for more primes $p$ and combining the results: for example, all primes outside of 2, 3 and 5 can only have a remainder of 7, 11, 13, 17, 19, 23 or 29 modulo 30. However, the further you go the more special cases you need to consider, and the uglier your code becomes; as you can see, just looking at cases modulo 6 requires us to write a function about three times as long as previously. This method therefore will only be able to take us so far before the code we'd need to write would become too unwieldy for practicality.
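To give a flavour of the modulo-30 case, here is one hypothetical way it could be written, encoding the gaps between the admissible residues 7, 11, 13, 17, 19, 23, 29, 31 in a table (not from the original post; is_prime() is again a trial-division stand-in):

```python
def is_prime(n):
    # stand-in trial-division primality test; assumed predefined in the post
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def prime_sum_mod30(h_E, M):
    # only visit n coprime to 30, i.e. n = 1, 7, 11, 13, 17, 19, 23, 29 (mod 30)
    y = sum(h_E(p) for p in (2, 3, 5) if p < M)
    gaps = (4, 2, 4, 2, 4, 6, 2, 6)   # steps 7->11->13->17->19->23->29->31->37
    n, i = 7, 0
    while n < M:
        if is_prime(n):
            y += h_E(n)
        n += gaps[i]
        i = (i + 1) % 8
    return y
```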


This second prime_sum() version is a rudimentary example of a technique called prime sieving. The idea is to use quick computations to eliminate a large percentage of integers from consideration in a way that doesn't involve direct primality testing, since this is computationally expensive. Sieving techniques are an entire field of research in their own right, so I thought I'd just give as an example one of the most famous methods: the Sieve of Eratosthenes (named after the ancient Greek mathematician who is thought to have first come up with the idea). This takes as input a positive bound $M$ and returns a list of all prime numbers less than $M$. The method goes as follows:
  1. Start with a list of boolean flags indexed by the numbers 2 through $M$, and set all of them to True. 
  2. Let $i$ be the index of the smallest True entry not yet considered. Set the entries at indices $2i, 3i, 4i, \ldots$ (the proper multiples of $i$) to False.
  3. Repeat step 2 until the smallest unconsidered True index exceeds $\sqrt{M}$.
  4. Return a list of all integers $i$ such that the entry at index $i$ is True.
This is definitely a case where a (moving) picture is worth a thousand words:
[Animation: a graphic representation of the Sieve of Eratosthenes being used to generate all primes less than 121. Courtesy Wikipedia: "Sieve of Eratosthenes animation", licensed under CC BY-SA 3.0 via Wikimedia Commons.]
How and why does this work? By mathematical induction, at each iteration the index of the first True entry will always be prime. However any multiple thereof is by definition composite, so we can walk along the list and flag them as not prime. Wash, rinse, repeat. We can stop at $\sqrt{M}$, since all composite numbers at most $M$ in magnitude must have at least one prime factor at most $\sqrt{M}$ in size.

Here is a third version of our prime_sum() function that utilizes the Sieve of Eratosthenes:
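A hypothetical reconstruction following the four steps above (not the original listing):

```python
def sieve_of_eratosthenes(M):
    # return a list of all primes less than M, per the steps described above
    flags = [True] * M
    flags[0:2] = [False, False]       # 0 and 1 are not prime
    for i in range(2, int(M ** 0.5) + 1):
        if flags[i]:                  # i is prime: flag its proper multiples
            for j in range(i * i, M, i):
                flags[j] = False
    return [n for n in range(M) if flags[n]]

def prime_sum_v3(h_E, M):
    # no primality tests at all: sieve once, then sum over the resulting primes
    return sum(h_E(p) for p in sieve_of_eratosthenes(M))
```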

Let's see how the three versions stack up against each other time-wise in the Sage terminal. I've saved the three functions in a file called prime_sum_functions.py, which I then import up front (if you want to do the same yourself, you'll need to import or define appropriate is_prime() and sqrt() functions at the top of the file). I've also defined a sample toy function h_E() and bound M:

sage: from prime_sum_functions import *
sage: def h_E(n): return sin(float(n))/float(n)
sage: M = 10000
sage: prime_sum_v1(h_E,M)
sage: prime_sum_v2(h_E,M)
sage: prime_sum_v3(h_E,M)
sage: %timeit prime_sum_v1(h_E,M)
1 loops, best of 3: 363 ms per loop
sage: %timeit prime_sum_v2(h_E,M)
1 loops, best of 3: 206 ms per loop
sage: %timeit prime_sum_v3(h_E,M)
10 loops, best of 3: 86.8 ms per loop

Good news! All three functions (thankfully) produce the same result. And we see version 2 is about 1.8 times faster than version 1, while version 3 is four times as fast. These ratios remained roughly the same when I timed the functions on larger bounds, which indicates that the three versions have the same or similar asymptotic scaling - this should be expected, since no matter what we do we will always have to check something at each integer up to the bound.


It should be noted, however, that the Sieve of Eratosthenes as implemented above would be a terrible choice for my GSoC code. This is because in order to enumerate the primes up to $M$ we need to create a list in memory of size $M$. This isn't an issue when $M$ is small, but for my code I need $M \sim 10^{11}$; an array of booleans that size would take up about 12 gigabytes in memory, and any speedups we get from not having to check for primality would be completely obliterated by read/write slowdowns due to working with an array that size. In other words, while the Sieve of Eratosthenes has great time complexity, it has abysmal space complexity.

Thankfully, more memory-efficient sieving methods exist that drastically cut down the space requirements. The best of these - for example, the Sieve of Atkin - need about $\sqrt{M}$ space. For $M \sim 10^{11}$ this translates to only about 40 kilobytes; much more manageable.
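The Sieve of Atkin is intricate, but even a segmented Sieve of Eratosthenes gets the working memory down to roughly $\sqrt{M}$: sieve the interval $[2, M)$ in blocks, using only the base primes up to $\sqrt{M}$. A hypothetical sketch (not from the original post):

```python
import math

def segmented_prime_sum(h_E, M):
    # sum h_E(p) over primes p < M using O(sqrt(M)) memory
    if M <= 2:
        return 0
    limit = math.isqrt(M - 1) + 1              # base primes up to sqrt(M)
    base = [True] * (limit + 1)
    base[0:2] = [False, False]
    for i in range(2, math.isqrt(limit) + 1):
        if base[i]:
            for j in range(i * i, limit + 1, i):
                base[j] = False
    base_primes = [p for p in range(limit + 1) if base[p]]

    y, lo = 0, 2
    seg = max(limit, 1024)                     # segment length ~ sqrt(M)
    while lo < M:
        hi = min(lo + seg, M)
        flags = [True] * (hi - lo)             # flags for the block [lo, hi)
        for p in base_primes:
            # first multiple of p in [lo, hi) that is at least p*p
            start = max(p * p, ((lo + p - 1) // p) * p)
            for j in range(start, hi, p):
                flags[j - lo] = False
        for n in range(lo, hi):
            if flags[n - lo]:
                y += h_E(n)
        lo = hi
    return y
```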

Of course, there's always a downside: bleeding-edge prime enumeration methods are finicky and intricate, and there are a plethora of ways to get them wrong when implementing them. At some point squeezing an extra epsilon of speedup from your code is no longer worth the time and effort it will take to get there. For now I've implemented a more optimized version of the second prime_sum() function in my code (where we skip over all integers that are obviously not prime), since that is my happy middle ground. If I have time at the end of the project I will revisit the issue of efficient prime enumeration and try to implement a more optimized sieving method, but that is a tomorrow problem.

by Simon Spicer (noreply@blogger.com) at July 21, 2014 04:00 PM

July 19, 2014

Vince Knight

Using Github pages and Python to distribute my conference talks

I’m very conscious about wanting to share as much of what I do as a researcher/instructor in as easy a way as possible. One example of this is the slides/decks that I use for talks I give. 3 or 4 years ago I was doing this with a Dropbox link. More recently this has led to me putting everything on github, but this wasn’t ideal as I’ve started to accumulate a large number of repositories for the various talks I give.

One of the great things about using github though is that for html presentations (reveal.js is the one I’ve used a couple of times), you can use the github deployment branch gh-pages to serve the files directly. A while back I put a video together showing how to do this:

So there are positives and negatives. After moving to a jekyll framework for my blog I started thinking of a way of getting all the positives without any negatives…

  • I want to use a git+github workflow;
  • I don’t want to have a bunch of separate repositories anymore;
  • I want to be able to just write my talks and they automatically appear online.

My immediate thought was to just use jekyll, but as far as I can tell I’d have to hack it a bit to get blog posts to be actual .pdf files (for my beamer slides) (please tell me if I’m wrong!). I could of course write a short blog post with a link to the file, but this was one extra step on top of just writing my talk that I didn’t want to have to do. Instead of hacking jekyll I decided to write a very simple Python script. You can see it here, but I’ll use this blog post to walk through how it works.

I have a Talks repository:

~$ cd Talks
Talks$ ls


What you can see in there are 3 directories (each starting with a date) and various other files. In each of the talk directories I just have normal files for those talks:

Talks$ cd 2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions/
...$ ls

2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions.nav    2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions.snm
2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions.pdf    2014-07-25-Measuring-the-Price-of-Anarchy-in-Critical-Care-Unit-Interactions.tex

I can work on those slides just as I normally would. Once I’m ready I go back to the root of my Talks directory and run the serve.py script:

Talks$ python serve.py

This script automatically goes through my sub-directories, reading the date from the directory names and identifying .html or .pdf files as talks. It then creates the index.html file, which is an index of all my talks (sorted by date) with a link to the right file. To get the site online you simply need to push it to the gh-pages branch of your github repository.

You can see the site here: drvinceknight.github.io/Talks. Clicking on relevant links brings up the live version of my talk, or at least as live as my latest push to the github gh-pages branch.

The python script itself is just:

 1 #!/usr/bin/env python
 2 """
 3 Script to index the talks in this repository and create an index.html file.
 4 """
 5 import os
 6 import glob
 7 import re
 9 root = "./"
10 directories = sorted([name for name in os.listdir(root) if os.path.isdir(os.path.join(root, name))], reverse=True)
11 talk_formats = ['.pdf', '.html']
14 index = open('index.html', 'w')
15 index.write(open('head.html', 'r').read())
16 index.write(open('header.html', 'r').read())
18 index.write("""
19             <body>
20             <div class="page-content">
21             <div class="wrap">
22             <div class="home">
23             <ul class='posts'>""")
25 for dir in directories:
26     if dir not in ['.git', 'reveal.js']:
27         talk = [f for f in glob.glob(root + dir + "/" + dir[:10] + '*') if  os.path.splitext(f)[1] in talk_formats][0]
28         date = talk[len(root+dir) + 1: len(root + dir) + 11]
29         title, extension =  os.path.splitext(talk[len(root+dir)+11:].replace("-", " "))
30         index.write("""
31                     <li>
32                     <span class="post-date">%s [%s]</span>
33                     <a class="post-link" href="%s">%s</a>
34                     <p>
35                     <a href="%s">(Source files)</a>
36                     </p>
37                     </li>
38                     """ % (date, extension[1:], talk, title, 'https://github.com/drvinceknight/Talks/tree/gh-pages/' + dir ))
39 index.write("""
40             </ul>
41             </div>
42             </div>
43             </div>
44             """)
45 index.write(open('footer.html', 'r').read())
46 index.write("</body>")
47 index.close()

The main lines that do anything are lines 25-38, everything else just reads in the relevant header and footer files.

So now getting my talks written and online is hardly an effort at all:

# Write awesome talk
Talks$ git commit -am 'Finished talk on proof of finite number of primes'  # This I would do anyway
Talks$ python serve.py  # This is the 1 extra thing I need to do
Talks$ git push origin  # This I would do anyway

There are various things that could be done to improve this (including pushing via serve.py as an option) and I’m not completely convinced I can’t just use jekyll for this, but it was quicker to write that script than to figure it out (or at least that was my conclusion after googling twice).

If anyone has any fixes/improvements (including: “You idiot just run jekyll with the build-academic-conference-talk-site flag”) that’d be super appreciated and if you want to see the Talk repository (with python script, css files, header.html etc…) it’s here: github.com/drvinceknight/Talks.

Now to finish writing my talk for ORAHS2014

July 19, 2014 12:00 AM

July 15, 2014

Vince Knight

Three Days: Two Higher Ed Conferences

Last week I took part in two conferences on the subject of higher education and so here’s a blog post with some thoughts and reflections.

Monday and Tuesday: ‘Workshop on Innovations in University Mathematics Teaching’

The first conference was organised by +Paul Harper, Rob Wilson and myself. The conference website can be found here. The main subject of this was active learning pedagogic techniques, in particular:

  • The flipped classroom;
  • Inquiry Based Learning (IBL).

The plan for these two days included an almost full day of talks on the Monday and an interactive IBL session on the Tuesday.

Here’s a couple of snippets from each session:

  • Robert Talbert gave the opening talk describing the flipped learning environment. You can see his slides here.

    I was quite nervous about Robert’s talk as he’s an expert in flipping classrooms and I was scheduled to talk after him. He gave a great talk (in fairness every single talk on the day was awesome) and here are a couple of things I noted down:

    A flipped classroom does not imply a flipped learning environment!

    A traditional classroom encourages the dependency of a student on the instructor.

    Flipped learning is not just videos out of class and homework in class.

    My favourite:

    The most important part of the flipped classroom is not what happens outside of the classroom (videos etc…) but what happens inside the classroom.

  • I spoke next and you can find my slides here.

    I mainly spoke about the programming course that I teach using a flipped class to our first years. I didn’t want to go in to any details about what a flipped learning environment is as I would most certainly not have been able to do it justice after Robert’s talk so I just gave an exemplar of it in practice. I might blog about the particular approach I used another time.

  • Toby Bailey then gave an excellent talk about the flipped classroom / peer instruction that he has in place for a large class.

    One of the highlights was certainly a video of his class in which we saw students respond to a question via in class clickers and then break in to groups of 2 to discuss the particular problem and finally answer the question one more time. Responses were then put up for all to see and it was great to see that students were indeed improving (you could see the distributions of clicker answers improve after the peer instruction).

    Here are a couple of other things I noted down during the talk:

    It’s not all about the lecturer.

    The importance of getting out of the way.

    Tell the class why you are doing it.

  • Stephen Rutherford then spoke about his flipped classroom in biosciences.

    This was a great talk and it was very neat to have a non-mathematical point of view. My first highlight from Steve’s talk can be seen in the photo above. I think that fundamental question (‘why am I better than a book’) could in fact help improve the instruction of many.

    A flipped classroom allows some control to be put in the hands of the students.

    The reason students are at university is to get an education and not necessarily a degree.

  • We then moved on to a relaxed panel discussion about the flipped classroom; a big highlight of that for me was the importance of involving students in the pedagogic reasoning behind whatever approach is used in a class.

  • The final ‘talk’ of the day was by Chris Sangwin who talked about his Moore Method class.

    This was a fascinating talk as Chris clearly described the success he has had with implementing a Moore Method class.

    In particular he highlighted the importance of the role of the instructor in this framework, where students are given a set of problems to work through and present to their peers (there is no lecturing in a Moore method class).

    Some highlights:

    In 2007, after his class finished students found the book from which his problems originated and continued to work through them on their own.

    In 2008, students set up a reading group and started to read complex mathematical topics.

  • The rest of the conference was a natural continuation from Chris’s talk as Dana Ernst and TJ Hitchman spoke about Inquiry Based Learning (a pedagogic term that encompasses the Moore method - I’m sure someone can correct me if I got that subtlety wrong).

    This was a really great interactive session that ran over to Tuesday. There is far too much that happened in this session and it was hard to take notes as we were very much involved, but here are some of the things that stuck with me.

    1. TJ ran a great first little session that basically got us to think about what we want to be as educators. One of the main things that came out of the ‘what do you want your students to remember in 20 years’ time’ question was that very few of us (I’m not sure if anyone did) mentioned the actual content of the courses we teach.
    2. The importance of creating a safe environment in which students can fail (in order to learn). Productive failure.
    3. The various difficulties associated with implementing an IBL approach due to class size (this was a recurring theme with regards to UK vs US class sizes).

    Another important point was the criteria that defines an IBL approach:

    Students are in charge of not only generating the content but also critiquing the content.

    You can find all of Dana and TJ’s content on their github repository.

After this session I enjoyed a good chat with TJ, who helped me figure out how to make my R/SAS course better. After that my project student, who will be working with me to evaluate my flipped classroom, had a great talk with Robert, TJ and Dana, who gave some really helpful advice. One of the highlights that came out of this was Robert putting very simply what I believe defines an effective pedagogic approach. Hopefully Robert will either correct me or forgive me for paraphrasing (EDIT: He has corrected me in the comments):

Whatever the approach: flipped classroom, IBL, interpretive dance, as long as the system allows you to empower your students and monitor how they are learning it is worth doing.

I’m probably forgetting quite a few details about the workshop (including the 6+ course conference dinner, which was pretty awesome). Now to describe the next conference, which was the Cardiff University Annual Learning and Teaching Conference.

Wednesday: Cardiff University Annual Learning and Teaching Conference

This was my first time attending this conference and I was lucky enough to have my abstract accepted so I was able to give a talk.

You can find my slides here.

In all honesty I was kind of tired so I didn’t take as detailed notes as I would like and/or as many photos but here are some highlights:

  • I enjoyed Stephen Rutherford discussing the plans of the Biosciences school to bring in peer assessment:

    Assessment for learning and not of learning

  • I liked Anne Cunningham reminding everyone that students obtain content from a variety of sources when talking about their use of scoop.it:

    The curators are not just the staff. The prime curators are the students.

  • Rob Wilson and Nathan Roberts gave an overview of the tutoring system that we as a university will be heading towards.

A great three days

This brings my attendance at education conferences to a total of 3 and I must say that I can’t wait to go to the next one (incidentally my abstract for the CETL-MSOR conference got accepted today). I really enjoy the vibe at education conferences as there is a slight sense of urgency in case anyone says anything that someone might be able to use/adapt/steal so as to improve their teaching and have a real impact on our students’ lives.

Two final links:

  • The #innovcardiff hashtag if you would like to see what was being said online about the innovation conference (big applause to Calvin Smith who did a tremendous job on there!);
  • The #cardiffedu hashtag if you would like to see the same for the Cardiff University education conference.

If anyone who attended the conferences has anything to add it would be great to hear from you and if anyone couldn’t make it but would like to know more: please get in touch :)

July 15, 2014 12:00 AM

July 14, 2014

Simon Spicer


I'm at the stage where my code essentially works: it does everything I initially set out to have it do, including computing the aforementioned zero sums for elliptic curve $L$-functions. However, the code is written in pure Python, and it is therefore not as fast as it could be.

This is often not a problem; Python is designed to be easy to read and maintain, and I'm hoping that the Python code I wrote is both of those. If we were just planning to run it on elliptic curves with small coefficients - for example, the curve represented by the equation $y^2=x^3-16x+16$ - this wouldn't be an issue. Curves with small coefficients have small conductors and thus few low-lying zeros near the central point, which allows us to run the zero sum code on them with small Delta parameter values. A small Delta value means the computation will finish very quickly regardless of how efficiently it's implemented, so it probably wouldn't be worth my while trying to optimize the code in that case.

To illustrate this point, here is the first and most high-level, generic version of the method that computes the sum $\sum_{\gamma} \text{sinc}^2(\Delta \gamma)$ over the zeros of a given elliptic curve $L$-function (minus documentation):
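In outline, the naive version does something like the following (a hypothetical standalone skeleton, not the actual Sage method: the coefficient function is passed in as cn, and the normalization and the digamma/spence() term are omitted):

```python
import math

def zerosum_sincsquared_sketch(cn, Delta=1.0):
    # naive version: visit every integer n < exp(2*pi*Delta), calling cn(n)
    # each time, even though cn vanishes off the prime powers
    t = 2 * math.pi * Delta
    y = 0.0
    for n in range(2, math.ceil(math.exp(t))):
        y += cn(n) * (t - math.log(n))
    return y
```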

[Of course, there's plenty going on in the background here. I have a separate method, self.cn() which computes the logarithmic derivative coefficients, and I call the SciPy function spence() to compute the part of the sum that comes from the Fourier transform of the digamma function $\frac{\Gamma^{\prime}}{\Gamma}$. Nevertheless, the code is simple and straightforward, and (hopefully) it's easy to follow the logic therein.]

However, the whole point of my GSoC project is to produce code that can be used for mathematical research; ultimately we want to push computations as far as we can and run the code on elliptic curves with large conductor, since curves with small conductor are already well-understood. Ergo, it's time I thought about going back over what I've written and seeing how I can speed it up.

There are two distinct ways to achieve speedups. The first is to rewrite the code more cleverly to eliminate unnecessary loops, coercions, function calls etc. Here is a second version I have written of the same function (still in Python):
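In outline, the improvement looks something like this (a hypothetical standalone illustration of the "iterate only over primes, handle all prime powers in-house" pattern, not the actual method: the weight is a placeholder, and a_p stands for the curve's trace-of-Frobenius function, whose values at prime powers satisfy the standard recurrence $a_{p^k} = a_p a_{p^{k-1}} - p\,a_{p^{k-2}}$ at primes of good reduction):

```python
import math

def prime_power_sum(a_p, primes, t):
    # accumulate contributions from all prime powers p^k < exp(t), generating
    # each prime's powers via the trace recurrence instead of separate calls
    bound = math.exp(t)
    total = 0.0
    for p in primes:
        if p >= bound:
            break
        a_prev, a_curr = 1, a_p(p)       # a_{p^0} = 1, a_{p^1} = a_p
        pk = p
        while pk < bound:
            total += a_curr * (t - math.log(pk)) / pk   # placeholder weight
            a_prev, a_curr = a_curr, a_p(p) * a_curr - p * a_prev
            pk *= p
    return total
```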

The major change I've made between the two versions is improving how the sum involving the logarithmic derivative coefficients is computed - captured in the variable y. In the first version, I simply iterated over all integers $n$ up to the bound $t$, calling the method self.cn() each time. However, the logarithmic derivative coefficient $c_n$ is zero whenever $n$ is not a prime power, and knowing its value for $n=p$ a prime allows us to efficiently compute its value for $p^k$ any power of that prime. It therefore makes sense to do everything "in-house": eliminate the method call to self.cn(), iterate only over primes, and compute the logarithmic derivative coefficients for all powers of a given prime together.

Let's see how the methods match up in terms of speed. Below we run the two versions of the zero sum method on the elliptic curve $E: y^2=x^3-16x+16$, which is a rank 1 curve of conductor 37:

sage: import sage.lfunctions.zero_sums as zero_sums
sage: ZS = zero_sums.LFunctionZeroSum(EllipticCurve([-16,16]))
sage: ZS._zerosum_sincsquared(Delta=1)
sage: ZS._zerosum_sincsquared_fast(Delta=1)
sage: %timeit ZS._zerosum_sincsquared(Delta=1)
10 loops, best of 3: 20.5 ms per loop
sage: %timeit ZS._zerosum_sincsquared_fast(Delta=1)
100 loops, best of 3: 3.46 ms per loop

That's about a sixfold speedup we've achieved, just by reworking the section of the code that computes the $c_n$ sum.

The downside, of course, is that the code in method version 2 is more complicated - and thus less readable - than that in version 1. This is often the case in software development: you can write code that is elegant and easy to read but slow, or you can write code that is fast but horribly complicated and difficult to maintain. And when it comes to mathematical programming, unfortunately, sometimes the necessity for speed trumps readability.

The second major way to achieve speedups is to abandon pure Python and switch to a more low-level language. I could theoretically take my code and rewrite it in C, for example; if done relatively intelligently I'm sure the resulting code would blow what I have here out of the water in terms of speed. However, I have no experience writing C code, and even if I did, getting the code to interface with the rest of the Sage codebase would be a major headache.

Thankfully there is a happy middle ground: Cython. Cython is a programming language - technically a superset of Python - that allows direct interfacing with C and C++ data types and structures. Code written in Cython can be as fast as C code and nearly as readable as pure Python. Crucially, because it's so similar to Python it doesn't require rewriting all my code from scratch. And Sage already knows how to deal with Cython, so there aren't any compatibility issues there.

I am therefore in the process of doing exactly that: rewriting my code in Cython. Mostly this is just a cut-and-paste job and is pleasingly hassle-free; however, in order to achieve the desired speedups, the bottleneck methods - such as our $\text{sinc}^2$ zero sum method above - must be modified to make use of C data types.

Here is the third, most recent version of the _zerosum_sincsquared() method for our zero sum class, this time written in Cython:

Definitely longer and uglier. I now must declare my (C) variables up front; previously Python just created them on the fly, which is convenient but slower than allocating memory space for the variables a priori. I've eliminated the use of complex arithmetic, so that everything can be done using C integer and float types. I still iterate over all primes up to the bound $t$; however, I now deal beforehand with those primes that divide the conductor of $E$ (for which the associated summand is calculated slightly differently), so that in the main loop I don't have to check at each step whether my prime $p$ divides the conductor or not. [This last check is expensive: the conductor $N$ can be very large - too large to cast even as a C long long - so we would have to use slower Python or Sage data types to represent it. Better to get rid of the check altogether.]

Let's see how this version holds up in a speed test. The Cython code has already been built into Sage and the class loaded into the global namespace, so I can just call it without having to attach or load any file:

sage: ZS = LFunctionZeroSum(EllipticCurve([-16,16]))
sage: ZS._zerosum_sincsquared_fast(Delta=1)
sage: %timeit ZS._zerosum_sincsquared_fast(Delta=1)
1000 loops, best of 3: 1.72 ms per loop

The good news: the Cythonized version of the method produces the same output as the Python versions, and it's definitely faster. The bad news: the speedup is only about a factor of 2, which isn't hugely impressive given how much uglier the code is.

Why is this? Crucially, we still iterate over all integers up to the bound $t$, testing for primality at each step. This is very inefficient: most integers are not prime (in fact, asymptotically 0 percent of all positive integers are prime); we should be using sieving methods to eliminate primality testing at those integers we already know are composite. For example, we should at the very least handle 2 separately and then iterate only over odd numbers. That immediately halves the number of primality tests we have to do, and we should therefore get a comparable speedup if primality testing is what dominates the runtime in this method.

This is therefore what I hope to implement next: rework the zero sum code yet again to incorporate prime sieving. This has applicability beyond just the $\text{sinc}^2$ method: all explicit formula-type sums at some point invoke summing over primes or prime powers, so having access to code that can do this quickly would be a huge boon.

by Simon Spicer (noreply@blogger.com) at July 14, 2014 12:37 PM

July 11, 2014

Nikhil Peter

Sage Android – UI Update

It’s been a busy week so far in the land of UI improvements. Disclaimer: I’m pretty bad at UI, so any and all suggestions are welcome. Firstly, last week’s problems have been resolved viz. Everything is saved nicely on orientation change, including the interacts, which required quite a bit of effort. In short, it involved […]

by hav3n at July 11, 2014 05:59 PM

July 08, 2014

Simon Spicer

The Birch and Swinnerton-Dyer Conjecture and computing analytic rank

Let $E$ be an elliptic curve with $L$-function $L_E(s)$. Recall that my Google Summer of Code project is to implement in Sage a method that allows us to compute $\sum_{\gamma} f(\Delta \gamma)$, where $\gamma$ ranges over the imaginary parts of the nontrivial zeros of $L_E$, $\Delta$ is a given positive parameter, and $f(x)$ is a specified symmetric continuous integrable function that is 1 at the origin. The value of this sum then bounds the analytic rank of $E$ - the number of zeros at the central point - from above, since we are summing $1$ with multiplicity $r_{an}$ in the sum, along with some other positive terms (that are hopefully small). See this post for more info on the method.

One immediate glaring issue here is that zeros that are close to the critical point will contribute values that are close to 1 in the sum, so the curve will then appear to have larger analytic rank than it actually does. An obvious question, then, is to ask: how close can the noncentral zeros get to the central point? Is there some way to show that they cannot be too close? If so, then we could figure out just how large of a $\Delta$ we would need to use in order to get a rank bound that is actually tight.

The rank zero curve $y^2 = x^3 - 7460362000712 x - 7842981500851012704$ has an extremely low-lying zero at $\gamma = 0.0256$ (and thus another one at $-0.0256$); as a result the zero sum looks like it's converging towards a value of 2 instead of the actual analytic rank of zero. In order to actually get a sum value out that's less than one we would have to use a $\Delta$ value of about 20; this is far beyond what's feasible due to the exponential dependence of the zero sum method on $\Delta$.
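To see numerically just how stubborn this zero pair is, here is a quick plain-Python check of its contribution to a $\text{sinc}^2$-type zero sum (a sketch; only the contribution of the single pair at $\pm 0.0256$ is computed):

```python
import math

def sinc2(x):
    # (sin(pi x)/(pi x))^2, with the convention sinc(0) = 1
    if x == 0:
        return 1.0
    return (math.sin(math.pi * x) / (math.pi * x)) ** 2

gamma = 0.0256
for Delta in (1, 2, 20):
    print(Delta, 2 * sinc2(Delta * gamma))
# For Delta = 1 or 2 the pair contributes essentially 2 to the sum;
# only around Delta = 20 does its contribution finally drop below 1.
```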

The good news is that there is hope in this regard; the nature of low-lying zeros for elliptic curve $L$-functions is actually the topic of my PhD dissertation (which I'm still working on, so I can't provide a link thereto just yet!). In order to understand how close the lowest zero can get to the central point we will need to talk a bit about the BSD Conjecture.

The Birch and Swinnerton-Dyer Conjecture is one of the two Clay Math Millennium Problems related to $L$-functions. The conjecture consists of two parts; the first part I mentioned briefly in this previous post. However, we can use the second part to gain insight into how good our zero sum based estimate for analytic rank will be.

Even though I've stated the first part of the BSD Conjecture before, for completeness I'll reproduce the full conjecture here. Let $E$ be an elliptic curve defined over the rational numbers, e.g. a curve represented by the equation $y^2 = x^3 + Ax + B$ for some integers $A$ and $B$ such that $4A^3+27B^2 \ne 0$. Let $E(\mathbb{Q})$ be the group of rational points on the elliptic curve, and let $r_{al}$ be the algebraic rank of $E(\mathbb{Q})$. Let $L_E(s)$ be the $L$-function attached to $E$, and let $L_E(1+s) = s^{r_{an}}\left[a_0 + a_1 s + \ldots\right]$ be the Taylor expansion of $L_E(s)$ about $s=1$ such that the leading coefficient $a_0$ is nonzero; $r_{an}$ is called the analytic rank of $E$ (see here for more details on all of the above objects). The first part of the BSD conjecture asserts that $r_{al}=r_{an}$; that is, the order of vanishing of the $L$-function about the central point is exactly equal to the number of free generators in the group of rational points on $E$.

The second part of the conjecture asserts that we actually know the exact value of that leading coefficient $a_0$ in terms of other invariants of $E$. Specifically:
$$ a_0 = \frac{\Omega_E\cdot\text{Reg}_E\cdot\prod_p c_p\cdot\#\text{Ш}(E/\mathbb{Q})}{(\#E_{\text{Tor}}(\mathbb{Q}))^2}. $$

Fear not if you have no idea what any of these quantities are. They are all things that we know how to compute - or at least estimate in size. I provide below brief descriptions of each of these quantities; however, feel free to skip this part. It suffices to know that we have a formula for computing the exact value of that leading coefficient $a_0$.
  1. $\#E_{\text{Tor}}(\mathbb{Q})$ is the number of rational torsion points on $E$. Remember that the solutions $(x,y)$ to the elliptic curve equation $y^2 = x^3 + Ax+B$, where $x$ and $y$ are both rational numbers, form a group. Recall also that the group of rational points $E(\mathbb{Q})$ may be finite or infinite, depending on whether the group has algebraic rank zero, or greater than zero. However, it turns out that there are only ever finitely many torsion points - those which can be added to themselves some finite number of times to get the group identity element. These points of finite order form a subgroup, denoted $E_{\text{Tor}}(\mathbb{Q})$, and the quantity in question is just the size of this finite group (squared in the formula). In fact, it's been proven that the size of $E_{\text{Tor}}(\mathbb{Q})$ is at most 16.
  2. $\Omega_E$ is the real period of $E$. This is perhaps a bit more tricky to define, but it essentially is a number that measures the size of the set of real points of $E$. If you plot the graph of the equation representing $E: y^2 = x^3 + Ax + B$ on the cartesian plane, you get something that looks like one of the following:

    The plots of the real solutions to four elliptic curves, and their associated real periods.

    There is a way to assign an intrinsic "size" to these graphs, which we denote the real period $\Omega_E$. The technical definition is that $\Omega_E$ is equal to the integral of the absolute value of the differential $\omega = \frac{dx}{2y}$ along the part of the real graph of $E$ that is connected to infinity (that or twice that depending on whether the cubic equation $x^3 + Ax + B$ has one or three real roots respectively).
  3. $\text{Reg}_E$ is the regulator of $E$. This is a number that measures the "density" of rational points on $E$. Recall that $E(\mathbb{Q}) \approx T \times \mathbb{Z}^{r_{al}}$, i.e. the free part of $E(\mathbb{Q})$ is isomorphic to $r_{al}$ copies of the integers. There is a canonical way to embed the free part of $E(\mathbb{Q})$ in $\mathbb{R}^{r_{al}}$ as a lattice; the regulator $\text{Reg}_E$ is the volume of the fundamental domain of this lattice. The thing to take away here is that elliptic curves with small regulators have lots of rational points whose coefficients have small numerators and denominators, while curves with large regulators have few such points.
  4. $\prod_p c_p$ is the Tamagawa product for $E$. For each prime $p$, one can consider the points on $E$ over the $p$-adic numbers $\mathbb{Q}_p$. The Tamagawa number $c_p$ is the ratio of the size of the full group of $p$-adic points on $E$ to the subgroup of $p$-adic points that are connected to the identity. This is always a positive integer, and crucially, in all but a finite number of cases the ratio is 1. Thus we can consider the product of the $c_p$ as we range over all prime numbers, and this is precisely the definition of the Tamagawa product.
  5. $\#\text{Ш}(E/\mathbb{Q})$ is the order of the Tate-Shafarevich group of $E$ over the rational numbers. The Tate-Shafarevich group of $E$ is probably the most mysterious part of the BSD formula; it is defined as the subgroup of the Weil–Châtelet group $H^1(G_{\mathbb{Q}},E)$ that becomes trivial when passing to any completion of $\mathbb{Q}$. If you're like me then this definition will be completely opaque; however, we can think of $\text{Ш}(E/\mathbb{Q})$ as measuring how much $E$ violates the local-global principle: that one should be able to find rational solutions to an algebraic equation by finding solutions modulo a prime number $p$ for each $p$, and then piecing this information together with the Chinese Remainder Theorem to get a rational solution. Curves with nontrivial $\text{Ш}$ have homogeneous spaces that have solutions modulo $p$ for all $p$, but no rational points. The main thing here is that $\text{Ш}$ is conjectured to be finite, in which case $\#\text{Ш}(E/\mathbb{Q})$ is just a positive integer (in fact, it can be shown for elliptic curves that if $\text{Ш}$ is indeed finite, then its size is a perfect square).
Why is the BSD Conjecture relevant to rank estimation? Because it helps us overcome the crucial obstacle to computing analytic rank exactly: without extra knowledge, it's impossible to decide using numerical techniques whether the $n$th derivative of the $L$-function at the central point is exactly zero, or just so small that it looks like it is zero to the precision that we're using. If we can use the BSD formula to show a priori that $a_0$ must be at least $M_E$ in magnitude, where $M_E$ is some number that depends only on some easily computable data attached to the elliptic curve $E$, then all we need to do is evaluate successive derivatives of $L_E$ at the central point to enough precision to decide if that derivative is less than $M_E$ or not; this is readily doable on a finite precision machine. Keep going until we hit a derivative which is definitely greater than $M_E$ in magnitude, at which point we can halt and declare that the analytic rank is precisely the order of that derivative.

In the context of the explicit formula-based zero sum rank estimation method implemented in our GSoC project, the size of the leading coefficient also controls how close the lowest noncentral zero can be to the central point. Specifically, we have the following result: Let $\gamma_0$ be the lowest-lying noncentral zero for $L_E$ (i.e. the zero closest to the central point that is not actually at the central point); let $\Lambda_E(s) = N^{s/2}(2\pi)^{-s}\Gamma(s)L_E(s)$ be the completed $L$-function for $E$, and let $\Lambda_E(1+s) = s^{r_{an}}\left[b_0 + b_1 s + b_2 s^2 + \ldots\right]$ be the Taylor expansion of $\Lambda_E$ about the central point. Then:
$$ \gamma_0 > \sqrt{\frac{b_0}{b_2}}. $$
Thankfully, $b_0$ is easily relatable back to the constant $a_0$, and we have known computable upper bounds on the magnitude of $b_2$, so knowing how small $a_0$ is tells us how close the lowest zero can be to the central point. Thus bounding $a_0$ from below in terms of some function of $E$ tells us how large of a $\Delta$ value we need to use in our zero sum, and thus how long the computation will take as a function of $E$.

If this perhaps sounds a bit handwavy at this point it's because this is all still a work in progress, to be published as part of my dissertation. Nevertheless, the bottom line is that bounding the leading $L$-series coefficient from below gives us a way to directly analyze the computational complexity of analytic rank methods. 

I hope to go into more detail in a future post about what we can do to bound from below the leading coefficient of $L_E$ at the central point, and why this is a far from straightforward endeavour.

by Simon Spicer (noreply@blogger.com) at July 08, 2014 06:04 PM

July 06, 2014

David Horgan (quantumtetrahedron)

Sagemath21: Knot theory calculations

This week I have been working on a variety of topics, but I’d like to post about some sagemath knot theory investigations I‘ve been doing. This is connected to one of my longer term aims, which is to perform SU(2) recoupling calculations based on Temperley-Lieb algebras using Sage Mathematics Software. Essentially I would like to write a set […]

by David Horgan at July 06, 2014 07:23 PM

Vince Knight

My CSV python video gets 10000 views

So the other day I got the following notification on my phone:

This both terrified me and made me smile. It’s nice to think that a bunch of people have (hopefully) been helped with what I put together but also slightly worrying as I think I would be able to put together a much better video if I did it now.

Here it is:

The basic code I use in the video is:

import csv

out = open('data.csv', 'r')  # Open a file in read mode
data = csv.reader(out)  # Initiate a csv reader object which will parse the data
data = [[row[0], eval(row[1]), eval(row[2])] for row in data]  # Read in the data and convert certain entries of each row
out.close()  # Close the file

new_data = [[row[0], row[1] + row[2]] for row in data]  # Create some new data

out = open('new_data.csv', 'w')  # Open a file in write mode
output = csv.writer(out)  # Initiate a csv writer object which will write the data in the correct format (csv)

for row in new_data:  # Loop through the data
    output.writerow(row)  # Write the row to file

out.close()  # Close the file
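For comparison, the same round trip can be written with context managers (so the files close themselves) and `float` in place of `eval`, which is safer on untrusted input. A sketch; the sample rows here are hypothetical stand-ins for the video's data:

```python
import csv

# create a small hypothetical data.csv to work with
with open('data.csv', 'w', newline='') as f:
    csv.writer(f).writerows([['a', 1, 2], ['b', 3, 4]])

with open('data.csv', newline='') as f:
    # float() converts the numeric columns without eval()'s risks
    data = [[row[0], float(row[1]), float(row[2])] for row in csv.reader(f)]

new_data = [[row[0], row[1] + row[2]] for row in data]

with open('new_data.csv', 'w', newline='') as f:
    csv.writer(f).writerows(new_data)
```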

There are a couple of other videos that are getting near that landmark, this is possibly my favourite of them:

The above video shows how to simulate a basic queue in Python. My Summer student James Campbell and I are working on something related at the moment.

July 06, 2014 12:00 AM

July 02, 2014

Vince Knight

A gitignore file for Python (Sage) and LaTeX

The main tools I use on a day to day basis are Python/Sage and LaTeX. Since learning to love git and even more so github my usual workflow when starting a new project is something like this:

$ git init
$ vim proof_that_there_are_a_finite_number_of_primes.py
$ vim application_for_fields_medal.tex
$ latexmk --xelatex application_for_fields_medal.tex
$ git status

Once I run git status, that’s when I realise that I’ve now got a bunch of files I don’t want to commit: .aux, .pyc etc…

I then normally throw whatever those files are to a .gitignore file. Despite teaching my students to never do anything twice and to always script whatever can be scripted: I have never actually put a .gitignore file together that has everything in it.

So here’s my first attempt.

This is basically a modification of the Python github .gitignore and a slight tweak of this gist (there doesn’t seem to be a bespoke LaTeX .gitignore as an option when you create a repository).

# LaTeX Files

# Byte-compiled / optimized / DLL files

# C extensions

# Distribution / packaging

# Installer logs

# Unit test / coverage reports

# Translations

# Mr Developer

# Rope

# Django stuff:

# Sphinx documentation

You can find this at this gist.
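Since the actual ignore patterns didn't survive syndication above (only the section headers did), here is a hedged sketch of the kind of entries that typically sit under the first two headings - standard patterns, not necessarily the gist's exact contents:

```
# LaTeX Files
*.aux
*.log
*.out
*.toc
*.bbl
*.blg
*.synctex.gz
*.fdb_latexmk
*.fls

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
```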

If anyone can suggest anything I’m missing it’d be great to hear it :)

July 02, 2014 12:00 AM

June 28, 2014

Nikhil Peter

GSoC – Post Mid-Term Update

Well, Mid-Terms have just finished(I passed!), and I’ve been neglecting this blog so here are some updates. The app is now in beta! More info can be found at the sage-devel post here Most of the work done is internal, but there are a few external fixes and enhancements as well. Agendas for the next […]

by hav3n at June 28, 2014 05:41 PM

June 27, 2014

Vince Knight

Pointing at a blog post about Sage and a treasure hunt

It looks like I might be moving... I've been playing with jekyll + github and really love the workflow so it looks like I might be moving this blog over there.

It's not official yet as I still need to think about a couple of things but in the meantime here's a post I just wrote over there about how the team I was in used Sage in a mathematical treasure hunt: http://goo.gl/iLgGnL

by Vincent Knight (noreply@blogger.com) at June 27, 2014 12:10 PM

Vince Knight

Using Sage in a Treasure hunt

Here is my first post using jekyll, I’ll see how this goes and potentially migrate my blog completely to here.

This past year I’ve had a very minor role helping to organise the EURO Summer Institute for Healthcare. This took part from June 11 to 20 and it was really cool with 20 (ish) young academics (from first year PhD student to early years of Post Doc) coming together to present a paper they’re working on and get bespoke feedback from peers and an expert in the field.

The academic activities are really valuable but arguably of more value is the fact that everyone comes together and has a good time. One such ‘good time’ was a treasure hunt that, because of some people dropping out, I was lucky enough to take part in. This was really great fun and in this post I’ll describe the problems as well as the Sage code my team used to almost win!

Here’s a photo that Paul Harper took of Pieter, Omar, Jenny and I huddled around Sage:

The treasure hunt involved us running around the small village to find an envelope with a particular mathematical problem. Once we’d solved that problem we would be given an indication as to where our next problem was.

Here are the various challenges (again: thanks to Paul for taking the photos!):

A clinical pathway problem (we did this one by hand):

A queueing problem:

To solve the above we wrote down the Markov chain (that’s actually the picture that Pieter is working on in the first photo of this blog post). Here’s the transition matrix corresponding to the Markov Chain:

\[\begin{pmatrix} -\frac{4}{45} & \frac{4}{45} & 0 & 0 & 0 \\
\frac{1}{20} & -\frac{5}{36} & \frac{1}{45} & \frac{1}{15} & 0 \\
0 & \frac{1}{10} & -\frac{1}{6} & 0 & \frac{1}{15} \\
0 & \frac{1}{20} & 0 & -\frac{13}{180} & \frac{1}{45} \\
0 & 0 & 0 & \frac{1}{10} & -\frac{1}{10} \end{pmatrix}\]

At the very beginning of the treasure hunt the organisers encouraged everyone to go and get their laptops. We originally thought that we’d manage without but at this point I ran to my room to go get my laptop :)

Here’s how we obtained the steady state probability \(\pi\):

Q = matrix([[-4/45,4/45,0,0,0,1],
            [1/20,-5/36,1/45,1/15,0,1],
            [0,1/10,-1/6,0,1/15,1],
            [0,1/20,0,-13/180,1/45,1],
            [0,0,0,1/10,-1/10,1]])
pi = Q.solve_left(vector([0,0,0,0,0,1]))

The above solves the matrix equation \(\pi Q=0\) with an extra column in \(Q\) to ensure that the elements of \(\pi\) sum to 1.
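If you'd like to double-check the answer outside of Sage, the same $\pi Q = 0$ system can be solved in plain Python with exact rational arithmetic. A sketch (the state ordering is the one implied by the transition matrix above; the elimination code is generic):

```python
from fractions import Fraction as F

# transition rate matrix from the post
Q = [
    [F(-4, 45), F(4, 45),  F(0),      F(0),        F(0)],
    [F(1, 20),  F(-5, 36), F(1, 45),  F(1, 15),    F(0)],
    [F(0),      F(1, 10),  F(-1, 6),  F(0),        F(1, 15)],
    [F(0),      F(1, 20),  F(0),      F(-13, 180), F(1, 45)],
    [F(0),      F(0),      F(0),      F(1, 10),    F(-1, 10)],
]
n = len(Q)

# Build A x = b for x = pi: four balance equations (columns of Q,
# one dropped as redundant) plus the normalisation sum(pi) = 1.
A = [[Q[i][j] for i in range(n)] for j in range(n - 1)]
A.append([F(1)] * n)
b = [F(0)] * (n - 1) + [F(1)]

# Gaussian elimination with partial pivoting, over the rationals
for col in range(n):
    piv = max(range(col, n), key=lambda r: abs(A[r][col]))
    A[col], A[piv] = A[piv], A[col]
    b[col], b[piv] = b[piv], b[col]
    for r in range(col + 1, n):
        m = A[r][col] / A[col][col]
        for c in range(col, n):
            A[r][c] -= m * A[col][c]
        b[r] -= m * b[col]

pi = [F(0)] * n
for r in range(n - 1, -1, -1):
    s = sum(A[r][c] * pi[c] for c in range(r + 1, n))
    pi[r] = (b[r] - s) / A[r][r]

print([float(x) for x in pi])
```

The resulting $\pi$ is exact, so it can be checked directly against $\pi Q = 0$ and $\sum_i \pi_i = 1$.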

Finally, the particular question asked for the following weighted sum (corresponding to the mean utilisation):


This gave a value of about \(0.499\).

I’m obviously missing out a couple of details (the actual description of the state space) but I’ll leave that as an exercise.

A chinese postman problem:

I’m not entirely sure what we did here as Pieter kind of ran with this one but at some point I was asked to calculate a matching on our graph. Here’s the Sage code:

sage: K = 500  # The inbuilt matching algorithm aims to maximise: we wanted to minimise so I threw a large constant in...
sage: G = Graph( { 1: {2: -23+K, 5:-20+K, 8:-19+K},
....:              2: {1: -23+K, 3:-8+K, 5:-18+K},
....:              3: {2: -8+K, 4:-5+K, 5:-16+K},
....:              4: {3: -5+K, 5:-14+K, 7:-7+K, 10:-21+K},
....:              5: {1: -20+K, 2:-18+K, 3:-16+K, 4:-14+K, 6:-1+K},
....:              6: {5: -1+K, 7:-8+K, 8:-20+K, 9:-12+K, 10:-17+K},
....:              7: {4: -7+K, 6:-8+K},
....:              8: {1: -19+K, 6:-20+K, 9:-15+K, 11:-20+K},
....:              9: {6:-12+K, 8:-15+K, 10:-14+K, 11:-12+K},
....:              10: {4:-21+K, 6:-17+K, 9:-14+K, 11:-18+K},
....:              11: {8:-20+K, 9:-12+K, 10:-18+K},
....:               }, sparse = True)
sage: G.matching()
[(1, 8, 481), (2, 3, 492), (4, 7, 493), (5, 6, 499), (9, 11, 488)]

I’m not entirely sure that’s the right graph I was supposed to use but it’s what I have left over in my Sage notebook…
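The $K$ trick above deserves a sanity check: it works because every maximum matching uses the same number of edges, so adding a constant to each weight shifts all candidate totals equally. A brute-force illustration on a hypothetical 4-node graph (plain Python, not the hunt's actual graph):

```python
# toy complete graph on 4 nodes with hypothetical edge weights
w = {(1, 2): 23, (3, 4): 5, (1, 3): 20, (2, 4): 7, (1, 4): 19, (2, 3): 9}
K = 500  # large constant, as in the Sage session above

def matchings(nodes):
    # yield all perfect matchings of an even-sized node list
    if not nodes:
        yield []
        return
    a = nodes[0]
    for b in nodes[1:]:
        rest = [x for x in nodes[1:] if x != b]
        for m in matchings(rest):
            yield [(a, b)] + m

perfect = list(matchings([1, 2, 3, 4]))
best_min = min(perfect, key=lambda m: sum(w[e] for e in m))
best_shifted = max(perfect, key=lambda m: sum(K - w[e] for e in m))

# every perfect matching here has 2 edges, so maximising K - w
# picks out exactly the minimum-weight matching
assert best_min == best_shifted
print(best_min)  # [(1, 3), (2, 4)]
```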

A packing problem:

I had remembered seeing that packing problems were implemented in Sage so as we were running back from collecting the clue I yelled: this will take 3 minutes!

Sadly, this wasn’t the case as the only implementation involves bins of the same size (which isn’t the case here). As I read around the docs the rest of my team was able to solve this heuristically which left us with the last challenge and at this point nicely in the lead!

The problem of doom that made everyone cry:

After having pulled in the data here is about as far as Sage got us:

p = list_plot(healthy, color='red', size=30)
p += list_plot(pathological, color='green',size=30)

This was a tricky problem. We had no wifi in our hotel so downloading a relevant R package was out of the question.

We eventually formulated a non linear integer program but sadly our solver didn’t seem able to handle it in time. With two minutes to go (the deadline was dinner) one of the teams who had been doing things very steadily ran over claiming that they might have a solution. Everyone went quiet and walked over as their solution was checked and verified. It was a really nice moment as everyone clapped and cheered. Here’s a picture of a bunch of us crowded around the solution understanding how they took a stab at fixing some of the variables so that the solver could get the solution.

This was a phenomenal piece of fun. You can find another description with a bunch of funny photos of us all struggling here.

If anyone has some suggestions to how we could have solved any problems any faster I’d be delighted to hear them :)

June 27, 2014 12:00 AM

June 26, 2014

Simon Spicer

Fun with elliptic curve L-function explicit formulae

Although I gave a brief exposition of the "baseline" explicit formula for elliptic curve $L$-functions in a previous post, I wanted to spend some more time showing how one can use this formula to prove some useful statements about elliptic curves and their $L$-functions. Some of these are already implemented in some way in the code I am building for my GSoC project, and I hope to include more as time goes on.

First up, let's fix some notation. For the duration of this post we fix an elliptic curve $E$ over the rationals with conductor $N$. Let $L_E(s)$ be its associated $L$-function, and let $\Lambda_E(s)$ be the completed $L$-function thereof, i.e.
$\Lambda_E(s) = N^{s/2}(2\pi)^{-s}\Gamma(s)L_E(s)$, where $\Gamma$ is the usual Gamma function on $\mathbb{C}$. If you want to know more about how $E$, $L_E$ and $\Lambda_E$ are defined and how we compute with them, some of my previous posts (here and here) go into their exposition in more detail.

Let's recap briefly how we derived the baseline explicit formula for the $L$-function of $E$ (see this post for more background thereon). Taking logarithmic derivatives of the above formula for $\Lambda_E(s)$ and shifting to the left by one gives us the following equality:
$$\frac{\Lambda_E^{\prime}}{\Lambda_E}(1+s) = \log\left(\frac{\sqrt{N}}{2\pi}\right) + \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \frac{L_E^{\prime}}{L_E}(1+s).$$
Nothing magic here yet. However, $\frac{\Lambda_E^{\prime}}{\Lambda_E}$, $\frac{\Gamma^{\prime}}{\Gamma}$ and $\frac{L_E^{\prime}}{L_E}$ all have particularly nice series expansions about the central point. We have:

  • $\frac{\Lambda_E^{\prime}}{\Lambda_E}(1+s) = \sum_{\gamma} \frac{s}{s^2+\gamma^2}$, where $\gamma$ ranges over the imaginary parts of the zeros of $L_E$ on the critical line; this converges for any $s$ not equal to a zero of $L_E$.
  • $\frac{\Gamma^{\prime}}{\Gamma}(1+s) = -\eta + \sum_{k=1}^{\infty} \frac{s}{k(k+s)}$, where $\eta$ is the Euler-Mascheroni constant $=0.5772156649\ldots$ (this constant is usually denoted by the symbol $\gamma$ - but we'll be using that symbol for something else soon enough!); this sum converges for all $s$ not equal to a negative integer.
  • $\frac{L_E^{\prime}}{L_E}(1+s) = \sum_{n=1}^{\infty} c_n n^{-s}$; this only converges absolutely in the right half plane $\Re(s)>\frac{1}{2}$.

Here the coefficients $c_n$ are given by
$$ c_n = \begin{cases}
\left[p^m+1-\#E(\mathbb{F}_{p^m})\right]\frac{\log p}{p^m}, & n = p^m \mbox{ is a prime power,} \\
0 & \mbox{otherwise.}\end{cases} $$

Assembling these equalities gives us the aforementioned explicit formula:
$$ \sum_{\gamma} \frac{s}{s^2+\gamma^2} = \left[-\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)\right] + \sum_{k=1}^{\infty} \frac{s}{k(k+s)} + \sum_n c_n n^{-s}$$
which holds for any $s$ where all three series converge. It is this formula which we will use repeatedly to plumb the nature of $E$.

For ease of exposition we'll denote the term in the square brackets $C_0$. It pitches up a lot in the math below, and it's a pain to keep writing out!

Some things to note:

  • $C_0 = -\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)$ is easily computable (assuming you know $N$). Importantly, this constant depends only on the conductor of $E$; it contains no other arithmetic information about the curve, nor does it depend in any way on the complex input $s$.
  • The sum $\sum_{k=1}^{\infty} \frac{s}{k(k+s)}$ doesn't depend on the curve at all. As such, when it comes to computing values associated to this sum we can just hard-code the computation before we even get to working with the curve itself.
  • The coefficients $c_n$ can be computed by counting the number of points on $E$ over finite fields up to some bound. This is quick to do for any particular prime power.
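That point-counting description can be sketched in a few lines of plain Python (brute force, so only practical for small $p$; the helper names are mine, not the Sage API):

```python
import math

def count_points(a4, a6, p):
    """Brute-force count of points on y^2 = x^3 + a4*x + a6 over F_p,
    including the single point at infinity."""
    # how many y in F_p square to each residue
    squares = {}
    for y in range(p):
        r = y * y % p
        squares[r] = squares.get(r, 0) + 1
    affine = sum(squares.get((x**3 + a4 * x + a6) % p, 0) for x in range(p))
    return affine + 1

def c_p(a4, a6, p):
    # the m = 1 (prime) case of the c_n formula above:
    # c_p = [p + 1 - #E(F_p)] * log(p) / p
    return (p + 1 - count_points(a4, a6, p)) * math.log(p) / p

# For E: y^2 = x^3 - x we get #E(F_5) = 8, so c_5 = -2*log(5)/5
print(count_points(-1, 0, 5), c_p(-1, 0, 5))
```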

Good news: the code I've written can compute all the above values quickly and efficiently:

sage: E = EllipticCurve([-12,29])
sage: Z = LFunctionZeroSum(E)
sage: N = E.conductor()
sage: print(Z.C0(),RDF(-euler_gamma+log(sqrt(N)/(2*pi))))
(2.0131665172, 2.0131665172)
sage: print(Z.digamma(3.5),RDF(psi(3.5)))
(1.10315664065, 1.10315664065)
sage: Z.digamma(3.5,include_constant_term=False)
sage: Z.cn(389)
sage: Z.cn(next_prime(1e9))
sage: timeit('Z.cn(next_prime(1e9))')
625 loops, best of 3: 238 µs per loop

So by computing values on the right we can compute the sum on the left - without having to know the exact locations of the zeros $\gamma$, which in general is hard to compute.

Now that we have this formula in the bag, let's put it to work.


If we multiply the sum over the zeros by $s$ and let $\Delta = 1/s$, we get
$$\sum_{\gamma} \frac{\Delta^{-2}}{\Delta^{-2}+\gamma^2} = \sum_{\gamma} \frac{1}{1+(\Delta\gamma)^2}.$$
Note that for large values of $\Delta$, $\frac{1}{1+(\Delta\gamma)^2}$ is small but strictly positive for all nonzero $\gamma$, and 1 for the central zeros, which have $\gamma=0$. Thus the value of the sum when $\Delta$ is large gives a close upper bound on the analytic rank $r$ of $E$. That is, we can bound the rank of $E$ from above by choosing a suitably large value of $\Delta$ and computing the quantity on the right in the inequality below:
$$r < \sum_{\gamma} \frac{1}{1+(\Delta\gamma)^2} = \frac{1}{\Delta}\left[C_0 + \sum_{k=1}^{\infty} \frac{1}{k(1+\Delta k)} + \sum_n c_n n^{-1/\Delta}\right]. $$
Great! Right? Wrong. In practice this approach is not a very good one. The reason is that the infinite sum over $n$ only converges absolutely for $\Delta<2$, and for $\Delta$ values as small as this, the obtained bound won't be very good. A value of $\Delta=2$, for example, gives us the zero sum $\sum_{\gamma} \frac{1}{1+(2\gamma)^2}$. If a curve's $L$-function has a zero with imaginary part about 0.5, for example - as many $L$-functions do - then such a zero will contribute 0.5 to the sum. And because zeros come in symmetric pairs, the sum value will then be at least a whole unit larger than the actual analytic rank. In general, for $\Delta<2$ the computed sum can be quite a bit bigger than the actual analytic rank of $E$.

Moreover, even though the Generalized Riemann Hypothesis predicts that the sum $\sum_n c_n n^{-1/\Delta}$ does indeed converge for any positive value of $\Delta$, in practice the convergence is so slow that we end up needing to compute inordinate amounts of the $c_n$ in order to hope to get a good approximation of the sum. So no dice; this approach is just too inefficient to be practical.

A graphic depiction of the convergence problems we run into when trying to evaluate the sum over the $c_n$. For the elliptic curve $E: y^2 + y = x^3 - 79x + 342$ the above plot shows the cumulative sum $\sum_{n<T} c_n n^{-1/4}$ for $T$ up to 100000; this corresponds to $\Delta=4$. Note that even when we include this many terms in the sum, its value still varies considerably. It's unlikely the final computed value is correct to a single decimal digit yet.


We can get around this impasse by evaluating a modified sum over the zeros, one which requires us to only ever need to compute finitely many of the $c_n$. Here's how we do it. Start with the sum $\sum_{\gamma} \frac{s}{s^2+\gamma^2}$, and divide by $s^2$ to get $\frac{1}{s(s^2+\gamma^2)}$. Now hearken back to your college sophomore math classes and take the inverse Laplace transform of this sum. We get
$$ \mathcal{L}^{-1} \left[\sum_{\gamma} \frac{1}{s(s^2+\gamma^2)}\right](t) = \sum_{\gamma} \mathcal{L}^{-1} \left[\frac{1}{s(s^2+\gamma^2)}\right](t) = \frac{t^2}{2} \sum_{\gamma} \left(\frac{\sin(\frac{t}{2}\gamma)}{\frac{t}{2}\gamma}\right)^2 $$
Letting $\Delta = \frac{t}{2\pi}$ we get the sum
$$ \mathcal{L}^{-1} \left[\sum_{\gamma} \frac{s}{s^2+\gamma^2}\right](\Delta) = 2\pi^2\Delta^2 \sum_{\gamma} \text{sinc}^2(\Delta\gamma), $$
where $\text{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$, and $\text{sinc}(0) = 1$.

Note that as before, $\text{sinc}^2(\Delta\gamma)$ evaluates to 1 for the central zeros, and is small but positive for all nonzero $\gamma$ when $\Delta$ is large. So again, this sum gives an upper bound on the analytic rank of $E$, and this bound converges to the true analytic rank as $\Delta \to \infty$.

If we do the same - divide by $s^2$ and take inverse Laplace transforms - to the quantities on the right, we get the following:

  • $\mathcal{L}^{-1} \left[\frac{C_0}{s^2} \right] = 2\pi\Delta C_0;$
  • $ \mathcal{L}^{-1} \left[\sum_{k=1}^{\infty} \frac{1}{sk(k+s)}\right] = \sum_{k=1}^{\infty} \frac{1}{k^2}\left(1-e^{-2\pi\Delta k}\right);$
  • $\mathcal{L}^{-1} \left[ \sum_{n=1}^{\infty} c_n \frac{n^{-s}}{s^2} \right] = \sum_{\log n<2\pi\Delta} c_n\cdot(2\pi\Delta - \log n). $
Note that the last sum is finite: for any given value of $\Delta$, we only need to compute the $c_n$ for $n$ up to $e^{2\pi\Delta}$. This is the great advantage here: we can compute the sum exactly without introducing any truncation error. The other two quantities are also readily computable to any precision.
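As a sketch of how cheaply the curve-independent $k$-sum from the middle bullet can be evaluated (plain Python): past a modest cutoff the exponential is negligible, and the remaining tail is just a truncated $\sum_k 1/k^2 = \pi^2/6$.

```python
import math

def k_sum(Delta, tol=1e-16):
    """Evaluate sum_{k>=1} (1 - exp(-2*pi*Delta*k))/k^2 to roughly
    double precision."""
    t = 2 * math.pi * Delta
    total = 0.0
    k = 1
    # sum the terms where the exponential correction still matters
    while math.exp(-t * k) > tol:
        total += (1 - math.exp(-t * k)) / k**2
        k += 1
    # remaining tail: sum_{j>=k} 1/j^2 = pi^2/6 - sum_{j<k} 1/j^2
    return total + math.pi**2 / 6 - sum(1 / j**2 for j in range(1, k))

print(k_sum(1.0))  # already very close to pi^2/6 = 1.6449...
```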

Combining the above values and dividing by $2\pi^2\Delta^2$, we arrive at the rank bounding formula which is implemented in my GSoC code, hideous as it is:
$$\begin{align*} r < \sum_{\gamma} \text{sinc}^2(\Delta\gamma) = \frac{1}{\pi\Delta} &\left[ C_0 + \frac{1}{2\pi\Delta}\sum_{k=1}^{\infty} \frac{1}{k^2}\left(1-e^{-2\pi\Delta k}\right) \right. \\
&\left. + \sum_{n<e^{2\pi\Delta}} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right) \right] \end{align*}$$
Of course, the number of terms needed on the right hand side is still exponential in $\Delta$, so this limits the tightness of the sum we can compute on the left hand side. In practice a personal computer can compute the sum with $\Delta=2$ in a few seconds, and a more powerful cluster can handle $\Delta=3$ in a similar time. Beyond $\Delta$ values of about 4, however, the number of $c_n$ is just too great to make the sum effectively computable.


Explicit formula-type sums can also be used to answer questions about the distribution of zeros of $L_E$ along the critical line. Let $T$ be a positive real number, and $M_E(T)$ be the counting function that gives the number of zeros in the critical strip with imaginary part at most $T$ in magnitude. By convention, when $T$ coincides with a zero of $L_E$, we count that zero with weight $\frac{1}{2}$. In other words, we can write
$$ M_E(T) = \sum_{|\gamma|<T} 1 + \sum_{|\gamma|=T} 1/2.$$
There is a more mathematically elegant way to write this sum. Let $\theta(x)$ be the Heaviside function on $x$ given by
$$ \theta(x) = \begin{cases} 0, & x<0 \\
\frac{1}{2}, & x=0 \\
1, & x>0. \end{cases}$$
Then we have that
$$M_E(T) = \sum_{\gamma} \theta(T^2-\gamma^2). $$
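This counting convention is easy to sketch directly in Python, given a finite (and here entirely made-up) list of zero ordinates:

```python
def theta(x):
    """Heaviside step with the half-maximum convention theta(0) = 1/2."""
    return 0.0 if x < 0 else (0.5 if x == 0 else 1.0)

def zero_count(zeros, T):
    """M_E(T) = sum over imaginary parts gamma of theta(T^2 - gamma^2).

    `zeros` lists the imaginary parts gamma of the zeros, one entry per
    zero: conjugate pairs appear twice, central zeros as 0.0.
    """
    return sum(theta(T**2 - g**2) for g in zeros)
```

For example, with two central zeros and a conjugate pair at $\pm 1.5$, `zero_count([0.0, 0.0, 1.5, -1.5], 2.0)` counts all four; at $T = 1.5$ exactly, the pair contributes $\frac{1}{2}$ apiece.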
We have written $M_E(T)$ as a sum over the zeros of $L_E$, so we expect that it comprises the left-hand side of some explicit formula-type sum. This is indeed the case. Using Fourier transforms we can show that
$$M_E(T) = \frac{2}{\pi}\left[C_0 T + \sum_{k=1}^{\infty} \left(\frac{T}{k} - \arctan \frac{T}{k}\right) + \sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n) \right] $$
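A hedged sketch of the 'smooth part' of this formula, i.e. the first two terms with the oscillating $c_n$ sum dropped. The constant $C_0$ is defined earlier in the post; here I assume $C_0 = \log\left(\frac{\sqrt{N}}{2\pi}\right)$, which matches the later statement that $C_0 T = \frac{1}{2}T\log N + $ constant $\cdot T$:

```python
import math

def smooth_zero_count(N, T, terms=100000):
    """The 'smooth part' of M_E(T): the first two terms of the explicit
    formula, with the oscillating c_n sum dropped.

    Assumes C0 = log(sqrt(N)/(2*pi)).  The k-th summand behaves like
    T^3/(3k^3) for k >> T, so truncating at `terms` is safe for
    moderate T.
    """
    C0 = math.log(math.sqrt(N) / (2 * math.pi))
    k_sum = sum(T / k - math.atan(T / k) for k in range(1, terms + 1))
    return (2 / math.pi) * (C0 * T + k_sum)
```

This gives the 'expected number of zeros up to $T$' knowing only the conductor $N$; adding the $c_n$ sum back in would recover the jumps at the actual zeros.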

With this formula in hand we can start asking questions about how the zeros of $L_E$ are distributed. For example, how does the average zero density on the critical line scale as a function of $T$, the height up the critical strip we're considering? How does zero density scale with the curve's conductor $N$?

If one assumes the Generalized Riemann Hypothesis, we can show that $\sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n) = O(\log T)$. Moreover, this sum is in a sense equidistributed about zero, so we expect its contribution for a sufficiently large randomly chosen value of $T$ to be zero. The other two quantities are more straightforward. The $C_0 T$ term is clearly linear in $T$. Going back to the definition of $C_0$, we see that $C_0 T = \frac{1}{2}T\log N + $constant$\cdot T$. Finally, we can show that the sum over $k$ equals $T\log T + O(\log T)$. Combining this gives us the estimate
$$ M_E(T) = \frac{1}{\pi} T (\log N + 2\log T+a) + O(\log T) $$
for some explicitly computable constant $a$ independent of $E$, $N$ or $T$. That is, the number of zeros with imaginary part at most $T$ in magnitude is "very close" to $\frac{1}{\pi}T(\log N + 2\log T)$. Another way to put it is that the number of zeros in the interval $[T,T+1]$ is about $\frac{1}{2}\log N + \log T$.

The value of $M_E(T)$ for $T$ up to 30, for the elliptic curve $E: y^2 + y = x^3 - x$. The black line is just the 'smooth part' of $M_E(T)$, given by $\frac{2}{\pi}\left(C_0 T + \sum_{k=1}^{\infty} \left(\frac{T}{k} - \arctan \frac{T}{k}\right)\right)$. This gives us the 'expected number of zeros up to $T$', before we know any data about $E$ other than its conductor. The blue line is what we get when we add in $\sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n)$, which tells us the exact locations of the zeros: they will be found at the places where the blue curve jumps by 2.

Note that the sum over $n$ converges *very* slowly. To produce the above plot, I included terms up to $n=1\,000\,000$, and there is still considerable rounding visible in the blue curve. If I could somehow evaluate the $c_n$ sum in all its infinite glory, the resulting plot would be perfectly sharp, comprised of flat lines that jump vertically by 2 at the loci of the zeros.

These are two of the uses of explicit formula-type sums in the context of elliptic curve $L$-functions. If you want to read more about them, feel free to pick up my PhD thesis - when it eventually gets published!

by Simon Spicer (noreply@blogger.com) at June 26, 2014 04:23 PM

June 19, 2014

Simon Spicer

The structure of my GSoC code and a short demo

I'm at the point where I can explain the layout of the code that I've written so far, and even demonstrate some of its functionality.

As mentioned in previous posts, the initial goal of my Google Summer of Code project is to implement in Sage functionality that will allow us to bound the analytic rank of an elliptic curve $E$ via certain explicit formula-type sums relating to the $L$-function of $E$. One could certainly achieve this by simply writing a method on the existing elliptic curve class in Sage. However, the machinery we'll be using to bound rank is useful outside of the context of elliptic curves: it allows us to compute  sums over zeros of any $L$-function that comes from a modular form. In fact, so long as

  1. The $L$-function has an Euler product expansion over the primes;
  2. The Euler product coefficients - akin to the $a_p$ values we've mentioned previously - are readily computable; and
  3. The $L$-function obeys a functional equation which allows us to define and compute with the function inside of the critical strip;

then we can use this explicit-formula methodology to compute sums over the zeros of the $L$-function.

All the code is hosted on GitHub, so you can view it freely here (note: this repository is a branch of the main Sage repo, so it contains an entire copy of the Sage source. I'm only adding a tiny fraction of code in a few files to this codebase; I'll link to these files directly below).

To this effect, it makes sense to write a new class in Sage that can ultimately be made to compute zero sums for any motivic $L$-function. This is what I'll be doing. I've created a file zero_sums.py in the sage/lfunctions/ directory, inside of which I'll write a family of classes that take as input a motivic object, and contain methods to compute various sums and values related to the $L$-function of that object.

I'll start by writing two classes: LFunctionZeroSum_abstract, which will contain all the general methods that can apply to a zero sum estimator attached to any motivic $L$-function. The second class, LFunctionZeroSum_EllipticCurve, inherits from the abstract class, and contains the code specific to elliptic curve $L$-functions.

I have also added a method to the EllipticCurve_rational_field class in the sage/schemes/elliptic_curves/ell_rational_field.py file (the class modeling elliptic curves over the rationals) to access the analytic rank estimation functionality of the zero sum estimator.

Let's see the code in action. To download a copy yourself, say from within a project in cloud.sagemath.com, open up a terminal and type

~$ git clone git://github.com/haikona/GSoC_2014.git

This will download the code in the repo into a new folder GSoC_2014 in your current directory. cd into that directory, type make and hit enter to build. This will unfortunately take a couple of hours to complete, but the good news is you'll only need to do that once. Alternatively, if you have an existing freshly-built copy of the Sage source and good github-fu, you should be able to download just the relevant code and apply it, greatly speeding up the build time.

Once your copy of sage containing the new LFunctionZeroSum code is built, fire up that copy of sage (e.g. type ./sage while in the GSoC_2014 directory; don't just type sage, as this runs the system-wide copy of sage - not the one we want). This command-line interface will have all the functionality of any other current copy of Sage, but with the extra methods and classes I've written.

Let's start by estimating the rank of some elliptic curves:

sage: E = EllipticCurve([0,1,1,-2,0]); E
Elliptic Curve defined by y^2 + y = x^3 + x^2 - 2*x
over Rational Field
sage: E.rank(), E.conductor()
(2, 389)
sage: E.analytic_rank_bound()

In the code above, I've defined an elliptic curve given by the Weierstrass equation $y^2 +y = x^3+x^2-2x$; for those of you in the know, this is the curve with Cremona label '389a', the rank 2 curve with smallest conductor. I've then printed out the rank of the curve, which is indeed 2. Then I've called my analytic_rank_bound() method on $E$, and it returns a value which is strictly greater than the analytic rank of $E$. Since the analytic rank of an elliptic curve is always a non-negative integer, we can conclude that the analytic rank of $E$ is at most 2.

How long did this rank estimation computation take? Thankfully Sage has a timing function to answer this question:

sage: timeit('E.analytic_rank_bound()')
125 loops, best of 3: 3.33 ms per loop

Neat; that was quick. However, this elliptic curve has a small conductor - 389, hence it's Cremona label - so computing with its $L$-function is easy. The great thing with the zero sum code is that you can run it on curves with really large conductors. Here we reproduce the commands above on the curve $y^2 + y = x^3 - 23737x + 960366$, which has rank 8 and conductor $N \sim 4.6\times 10^{14}$:

sage: E = EllipticCurve([0,0,1,-23737,960366]); E
Elliptic Curve defined by y^2 + y = x^3 - 23737*x + 960366 
over Rational Field
sage: E.rank(), E.conductor()
(8, 457532830151317)
sage: E.analytic_rank_bound()
sage: timeit('E.analytic_rank_bound()')
125 loops, best of 3: 3.37 ms per loop

So we see that this zero sum code doesn't care very much about the size of the conductor of $E$; this is one of its huge plusses. There are downsides, of course - but we'll save those for another post.

Now let's look at the LFunctionZeroSum code.

sage: E = EllipticCurve([23,-100])
sage: Z = LFunctionZeroSum(E); Z
Zero sum estimator for L-function attached to 
Elliptic Curve defined by y^2 = x^3 + 23*x - 100 
over Rational Field

Right now we don't have too much beyond that which you can access from the EllipticCurve class. For one, we can compute the coefficients of the logarithmic derivative, since they are needed to compute the rank-estimating zero sum. We can also return and/or compute one or two other values associated to the zero sum class. As time goes on we'll flesh out more functionality for this class.

sage: for n in range(1,11):
....:     print(n,Z.logarithmic_derivative_coefficient(n))
(1, 0)
(2, 0.0)
(3, -1.09861228867)
(4, 0.0)
(5, 1.28755032995)
(6, 0)
(7, -0.277987164151)
(8, 0.0)
(9, -0.366204096223)
(10, 0)
sage: Z.elliptic_curve()
Elliptic Curve defined by y^2 = x^3 + 23*x - 100
over Rational Field
sage: Z.elliptic_curve().rank()
sage: Z.rankbound(Delta=1.7,function="gaussian")

All of this code is of course written to Sage spec. That means that every method written has documentation, including expected input, output, examples, and notes/warnings/references where necessary. You can access the docstring for a method from the command line by typing that method followed by a question mark and hitting enter.

All my code will initially be written in Python. Our first goal is to replicate the functionality that has already been written by Jonathan Bober in C. As such we're not aiming for speed here; instead the focus is to make sure all the code is mathematically correct.

Once that's up and running there are two natural directions to take the project:

  1. Speed the code up. Rewrite key parts in Cython, so that computations run faster, and thus analytic rank can be estimated for curves of larger conductor.
  2. Generalize the code. Write classes that model zero sum estimators for $L$-functions attached to eigenforms (those modular forms with Euler product expansions) of any level and weight, and allow for more types of sums to be computed. Ultimately write code that can be used to compute sums for any motivic $L$-function.

I'm going to attack speeding up the code first, since the elliptic curve rank estimation functionality will probably see the most use. If there's time at the end of the project I can look to generalizing the code as much as I can.

by Simon Spicer (noreply@blogger.com) at June 19, 2014 02:48 PM

The explicit formula and bounding analytic rank from above

Wednesday's post gave some detail on how one can numerically evaluate $L_E(s)$, the $L$-function attached to $E$, at any given complex input $s$. This is certainly the first step in determining the analytic rank of $E$, i.e. the order of vanishing of $L_E(s)$ at the central point $s=1$.

However, we immediately run into two major issues. The first, as mentioned previously, is that we have no proven lower bound on the size of the coefficients of the Taylor expansion of $L_E(s)$ at $s=1$, so we can never truly ascertain if the $n$th derivative there vanishes, or is just undetectably small.

Today's post focuses on a method that at least partially overcomes the second issue: that evaluating $L_E(s)$ and its derivatives takes at least $O(\sqrt{N})$ time, where $N$ is the conductor of $E$. This means that working with elliptic curve $L$-functions directly becomes computationally infeasible if the curve's conductor is too big.

To see how we can bypass this problem, we'll have to talk about logarithmic derivatives. The logarithmic derivative of a function  $f(s)$ is just
$$ \frac{d}{ds} \log(f(s)) = \frac{f^{\prime}(s)}{f(s)}. $$
Logarithmic derivatives have some nice properties. For one, they turn products into sums: if $f(s) = g(s)h(s)$, then $\frac{f^{\prime}}{f} = \frac{g^{\prime}}{g} + \frac{h^{\prime}}{h}$. Because we can write $L_E(s)$ as an Euler product indexed by the primes, we therefore have that
$$ \frac{L_E^{\prime}}{L_E}(s) = \sum_p \frac{d}{ds} \log\left(\left(1-a_p p^{-s} + \epsilon_p p^{-2s}\right)^{-1}\right).  $$
When you multiply this out, you rather amazingly get a Dirichlet series whose coefficients can be described in a very mathematically elegant way:
$$ \frac{L_E^{\prime}}{L_E}(1+s) = \sum_{n=1}^{\infty} c_n n^{-s}, $$
$$ c_n = \begin{cases}
\left(p^m+1-\#E(\mathbb{F}_{p^m})\right)\frac{\log p}{p^m}, & n = p^m \mbox{ a perfect prime power,} \\
0, & \mbox{otherwise.}\end{cases} $$

That is, the $n$th coefficient of $\frac{L_E^{\prime}}{L_E}(1+s)$ is $0$ when $n$ is not a prime power, and when it is, the coefficient relates directly to the number of points on the reduced curve over the finite field of $n$ elements. [Note we're shifting the logarithmic derivative over to the left by one unit, as this puts the central point at the origin i.e. at $s=0$.]

How does this yield a way to estimate analytic rank? Via something called the Explicit Formula for elliptic curve $L$-functions - which, confusingly, is really more of a suite of related methods than any one specific formula.

Remember the Generalized Riemann Hypothesis? This asserts, in the case of elliptic curve $L$-functions at least, that all nontrivial zeros of $L_E(s)$ lie on the line $\Re(s) = 1$, and two zeros can never lie on top of each other except at the central point $s=1$. The number of such zeros at the central point is precisely the analytic rank of $E$. Each nontrivial zero can therefore be written as $1\pm i\gamma$ for some non-negative real value of $\gamma$ - it turns out that whenever $L_E(1+i\gamma)=0$, then $L_E(1-i\gamma)=0$ too, so noncentral zeros always come in pairs. We believe GRH to be true with a high degree of confidence, since in the 150 years that $L$-functions have been studied, no-one has ever found a non-trivial zero off the critical line.

We can invoke GRH with logarithmic derivatives and something called the Hadamard product representation of an entire function to get another formula for the logarithmic derivative of $L_E(s)$:
$$ \frac{L_E^{\prime}}{L_E}(1+s) = -\log\left(\frac{\sqrt{N}}{2\pi}\right) - \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \sum_{\gamma} \frac{s}{s^2+\gamma^2}. $$
Here $\frac{\Gamma^{\prime}}{\Gamma}(s)$ is the logarithmic derivative of the Gamma function (known as the digamma function; this is a standard function in analysis that we know a lot about and can easily compute), and $\gamma$ runs over the imaginary parts of the zeros of $L_E(s)$ on the critical line $\Re(s)=1$.

What does this get us? A way to equate a certain sum over the zeros of $L_E(s)$ to other more tractable functions and sums. When you combine the two formulae for $\frac{L_E^{\prime}}{L_E}(1+s)$ you get the following equality:

$$ \sum_{\gamma} \frac{s}{s^2+\gamma^2} = \log\left(\frac{\sqrt{N}}{2\pi}\right) + \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \sum_n c_n\, n^{-s}.$$
The key realization here is that while computing the zeros of $L_E$ is in general nontrivial and time-consuming, all the other quantities in the equation above are readily computable. We can therefore evaluate the sum on the left by computing the quantities on the right to sufficient precision, none of which require knowledge of the exact locations of the zeros of $L_E(s)$.

The above is the first and perhaps most basic example of an explicit formula for elliptic curve $L$-functions, but it is not the one we want. Specifically, trouble still arises from the fact that we're still dealing with an infinite sum on the right, and when we look close to the central point $s=0$, convergence is unworkably slow. We can, however, work with related explicit formulae that reduce the sum on the right to a finite one.

The formula we'll use in our code can be obtained from the above equation by dividing both sides by $s^2$ and taking inverse Laplace transforms; we can also formulate it in terms of Fourier transforms. Specifically, we get the following. Recall the sinc function $\mbox{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$ with $\mbox{sinc}(0)=1$. Then we have:
$$ \sum_{\gamma} \mbox{sinc}(\Delta\gamma)^2 = Q(\Delta,N) + \sum_{\log n<2\pi\Delta} c_n\cdot\left(2\pi\Delta-\log n\right), $$
where $Q(\Delta,N)$ is a certain quantity depending only on $\Delta$ and $N$ that is easy to compute.

Let's look at the left hand side. Since $\mbox{sinc}(\Delta\gamma)^2$ is non-negative for any $\gamma$, and $\mbox{sinc}(0)^2 = 1$, when we sum over all zeros $\gamma$ we get a value which must be strictly greater than the number of zeros at the central point (i.e. the analytic rank of $E$). Moreover, as we increase $\Delta$, the contribution to the sum from all the noncentral zeros goes to zero, since $\mbox{sinc}(x) \to 0$ as $x \to \infty$.
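The decay of the noncentral contributions is easy to see numerically. Here is a toy Python sketch using a made-up finite list of zero ordinates (a real $L$-function has infinitely many zeros, but the distant ones contribute only $O(1/(\Delta\gamma)^2)$ each):

```python
import math

def sinc_sq(x):
    """sinc(x)^2 with sinc(x) = sin(pi*x)/(pi*x) and sinc(0) = 1."""
    if x == 0:
        return 1.0
    return (math.sin(math.pi * x) / (math.pi * x))**2

def zero_sum(zeros, Delta):
    """The left-hand side: sum over zero ordinates gamma of sinc(Delta*gamma)^2."""
    return sum(sinc_sq(Delta * g) for g in zeros)
```

With two central zeros (ordinate 0) and a few noncentral pairs, `zero_sum` stays strictly above 2 but approaches 2 as `Delta` grows, exactly as described above.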

A graphic representation of the sum on the left side of the equation for the $L$-function attached to the elliptic curve $y^2 = x^3 + 103x - 51$ for three increasing values of the parameter $\Delta$. Vertical lines have been plotted at $x=\gamma$ whenever $L_E(1+i\gamma) = 0$, and the height of each line is given by the black curve $\mbox{sinc}(\Delta x)^2$. Thus summing up the length of the vertical lines gives you the value of the sum on the left side of the equation. We see that as $\Delta$ increases, the contribution from the blue lines - corresponding to noncentral zeros - goes to zero, while the contribution from the central zeros in red remain at 1 apiece. Since there are two central zeros (plotted on top of each other here), the sum will limit to 2 from above as $\Delta$ increases.

Now consider the right hand side of the sum. Note that the sum over $n$ is finite, so we can compute it to any given precision without having to worry about truncation error. Since $Q(\Delta,N)$ is easy to compute, this allows us to evaluate the entire right hand side with (relatively) little fuss. We are therefore able, via this equality, to compute a quantity that converges to the analytic rank $r$ from above as $\Delta \to \infty$.

A graphical representation of the finite sum $\sum_{\log n<2\pi\Delta} c_n\cdot\left(2\pi\Delta-\log n\right)$ for the same curve, with $\Delta=1$. The black lines are the triangular function $y = \pm (2\pi\Delta-x)$, whose value at $\log n$ weights the $c_n$ coefficient. Each blue vertical line is placed at $x=\log n$, and has height $c_n\cdot\left(2\pi\Delta-\log n\right)$. Summing up the signed lengths of the blue lines gives the value of the sum over $n$. Note that there are only 120 non-zero terms in this sum when $\Delta=1$, so it is quick to compute.

This formula, and others akin to it, will form the core functionality of the code I will be writing and adapting to include in Sage, as they give us a way to quickly (at least when $\Delta$ is small) show that the analytic rank of a curve can't be too big.

There are some downsides, of course, most notable of which is that the number of terms in the finite sum on the right is exponential in $\Delta$. Tests show that on a laptop one can evaluate the sum in a few milliseconds for $\Delta=1$, a few seconds for $\Delta = 2$, but the computation takes on the order of days when $\Delta=4$. Nevertheless, for the majority of the elliptic curves we'll be looking at, $\Delta$ values of 2 or less will yield rank bounds that are in fact tight, so this method still has much utility.

Another issue that warrants investigation is to analyze just how fast - or slow - the convergence to $r$ will be as a function of increasing $\Delta$; this will allow us to determine what size $\Delta$ we need to pick a priori to get a good estimate out. But this is a topic for a future date; the next post or two will be dedicated toward detailing how I'm implementing this rank estimation functionality in Sage.

by Simon Spicer (noreply@blogger.com) at June 19, 2014 11:36 AM

June 12, 2014

William Stein

SageMathCloud Task Lists

I've added task list functionality to SageMathCloud (SMC), so you can keep track of a list of things to do related to a project, paper, etc. Task lists are saved as files on the filesystem, so they can be backed up in the usual way, automatically generated, etc. I doubt anybody is going to use SMC just for the tasks lists, but for people already using SMC in order to use Sage, write LaTeX documents, use IPython notebooks, etc., having a convenient integrated task list should come in handy.
To create a task list, in a project, click "+New", name the task list, then click the "Task List" button.  Here's what a task list looks like:

The Design

I used the task list quite a bit when implementing the task list, and significantly modified the interface many, many times. I also tried out numerous other todo list programs for inspiration. I eventually settled on the following key design choices, some of which are different than anything I've seen. In particular, the design targets highly technical users, which is not something I saw with any other todo list programs I tried.
  • Markdown editor: The task description is formatted using client-side rendered Github flavored markdown (using marked), including [ ] for checkboxes. I also include full MathJax support, and spent quite a while working around various subtleties of mixing MathJax and markdown. I'll be rewriting Sage worksheets to use this code. The editor itself is Codemirror 4 in markdown mode, so it respects whatever theme you choose, has nice features like multiple cursors, paren matching, vim/emacs modes, etc. Multiple people can edit the same task at once and everybody will see the changes as they are made (note: I haven't implemented showing other cursors.)
  • Relative dates and times: All dates and times are shown relative to right now. If something is due in 20 hours, it says "due about 20 hours from now". I also included a sortable column with the last time when a task was edited, also shown relative to the current time. This display uses the timeago jquery plugin. You can of course click on the due date to see the actual date.
  • Hashtags: As I was implementing (and removing) features such as priorities, a way to indicate which task you are currently working on, folders, etc, I decided that hashtags can provide every feature. Moreover, they are very text editor/programmer friendly. When editing a task, if you put #foo, then it is interpreted as a hashtag, which you can click on to show only tasks containing that tag. To disambiguate with markdown headings, to make a heading you have to include a whitespace, so # foo. I haven't added autocomplete for hashtags yet, but will. You can easily click on hashtags anywhere to select them, or use the bar at the top.
  • User-specified task order: The default sort order for tasks is custom. There is a drag handle so you can explicitly move tasks up and down in order to indicate how important they are to you (or whatever else). You can also click an up hand or down hand to knock the currently selected task to the bottom of the list of displayed tasks.
Of course, I still have an enormous list of functionality I would like to add, but that will have to wait. For example, I need to enable a chat associated to each task list, like the chats associated to Sage worksheets and other files. I also want to make it so one can select a range of tasks and copy them, move them, paste them into another list, etc. It would also be nice to be able to export task lists to HTML, PDF, etc., which should be fairly easy using pandoc.  I'm also considering making a note list, which is like a task list but without the due date or "done" box.  Because of all the infrastructure already there, it would be easy to add code evaluation functionality, thus creating something like worksheets, but from a different point of view (with maybe hashtags determining the Python process).

Databases and Differential Synchronization

One interesting thing I noticed when implementing task lists is that there are many similarities with the original sagenb.org design (and also IPython notebook), in which a worksheet is a list of cells that get evaluated, can be manually reordered, etc. Similarly, a task list is a list of "cells" that you edit, manually reorder, etc. With sagenb we had longstanding issues involving the order of each cell and assigning an integer numerical id (0, 1, 2, ...) to the cells, which resulted in things like cells seemingly getting randomly reordered, etc. Also, having multiple simultaneous viewers with automatic synchronization is very difficult with that model. For task lists, I've introduced some new models to address these issues.

A natural way to store a task list is in a database, and I initially spent some time coming up with a good database schema and implementing basic lists using Cassandra for the backend. However, I couldn't pursue this approach further, since I was concerned about how to implement realtime synchronization, and also about the lack of easily backing up complete task lists via snapshots, in git, etc. So instead I created an "object database" API built on top of a file that is synchronized across clients (and the server) using differential synchronization. The underlying file format for the database is straightforward -- there is one line in JSON format for each record in the database. When objects are changed, the file is suitably modified, synchronized to other clients, and events are triggered.

Since differential synchronization has no trouble with files that have "a few thousand lines", this approach works well for our purposes (since personal or shared todo lists are typically fairly short). Also, having one line per record is at least more git friendly than something like a sqlite database. I'm considering rewriting my implementation of IPython notebook sync on top of this abstraction.
Since I view the task list as a database, each task has a globally unique uuid. Also, instead of viewing the task order as being defined by an integer 0,1,2,3, ..., which leads to all manner of bugs, programming misery, race conditions, etc., we view the order as being determined by floating point positions. So to insert a task between tasks with positions 1 and 2, we just give the new task position 1.5.
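That insertion rule can be sketched in a few lines. This is an illustrative sketch only (the function name and the conventions for inserting at either end of the list are mine, not SMC's), and it ignores the fact that repeated halving eventually exhausts float precision, at which point positions would need renumbering:

```python
def position_between(before, after):
    """Float position for a task inserted between two neighbours.

    `before` / `after` are the neighbours' positions, or None when
    inserting at the start or end of the list.
    """
    if before is None and after is None:
        return 1.0                    # first task in an empty list
    if before is None:
        return after - 1.0            # insert at the front
    if after is None:
        return before + 1.0           # append at the back
    return (before + after) / 2.0     # midpoint between neighbours
```

So inserting between positions 1 and 2 yields 1.5, between 1 and 1.5 yields 1.25, and so on, with no renumbering of other tasks required.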

by William Stein (noreply@blogger.com) at June 12, 2014 01:31 PM

June 11, 2014

Simon Spicer

How to compute with elliptic curve L-functions and estimate analytic rank

In a previous post I defined what an elliptic curve $L$-function $L_E(s)$ is, and indicated why the behaviour of the function at the central point $s=1$ is of such great interest: the order of vanishing of $L_E(s)$ at $s=1$ is conjecturally precisely equal to the algebraic rank of $E$.

Because this equality is still conjectural, we will call the former quantity - the number of derivatives of $L_E(s)$ that vanish at $s=1$ - the analytic rank of $E$. The topic of this post is to address the question: given an elliptic curve $E$, how do we go about computing its analytic rank?

Before we can hope to answer this question, we need to know how to evaluate $L_E(s)$ itself for any given $s$. In the previous post I gave both the Euler product and Dirichlet series definitions for $L_E(s)$; to jog your memory, here's the Euler product of $L_E(s)$:
$$ L_E(s) = \prod_p \left(1-a_p p^{-s} + \epsilon_p p^{-2s}\right)^{-1}, $$
where the product runs over all prime numbers, $a_p = p+1-\#\{E(\mathbb{F}_p)\}$, and $\epsilon_p = 0$ if $p$ divides the conductor of $E$ and $1$ otherwise. The Dirichlet series is $L_E(s) = \sum_{n}a_n n^{-s}$, which is precisely what you get when you multiply out the Euler product. Note that we are suppressing the dependence on $E$ in both the $a_n$ and $\epsilon_p$ constants.

However, both the Euler product and Dirichlet series representations of $L_E(s)$ will only converge absolutely when the real part of $s$ exceeds $\frac{3}{2}$. Although the Sato-Tate Conjecture (now a theorem, but the name has stuck) implies that the expansions will in fact converge conditionally for $\Re(s)>\frac{1}{2}$, the convergence is so slow that attempting to evaluate $L_E(s)$ near the central point by multiplying or summing enough terms is horribly inefficient. As such, we need a better way to evaluate $L_E(s)$ - and thankfully, such better ways do indeed exist.

Remember how I mentioned in the previous post that $L$-functions obey a very nice symmetry condition? Well, here's that condition exactly: first, we need to define something called the completed $L$-function $\Lambda_E(s)$. This is just $L_E(s)$ multiplied by some extra factors. Specifically,
$$\Lambda_E(s) = N^{\frac{s}{2}}(2\pi)^{-s}\Gamma(s) L_E(s), $$
where $N$ is the conductor of $E$ and $\Gamma(s)$ is the usual Gamma function on $\mathbb{C}$ (the one that gives you $(n-1)!$ when you evaluate it at the positive integer $s=n$).

We can show that $\Lambda_E(s)$ is entire on $\mathbb{C}$; that is, it doesn't have any poles. Moreover, $\Lambda_E(s)$ obeys the glorious symmetry property
$$ \Lambda_E(s) = w_E \Lambda_E(2-s), $$
where $w_E$ is either $1$ or $-1$ and depends on $E$. This is called the functional equation for the $L$-function attached to $E$.

Another way to put it is that the shifted completed $L$-function $\Lambda_E(1+s)$ is either an even or an odd function of $s$. Because the factors $N^{\frac{s}{2}}$, $(2\pi)^{-s}$ and $\Gamma(s)$ are all readily computable, this allows us to determine the value of $L_E(s)$ when the real part of $s$ is less than $\frac{1}{2}$.

What's left, then, is to figure out how to efficiently evaluate $L_E(s)$ in the strip $\frac{1}{2} \le \Re(s) \le \frac{3}{2}$. This is called the critical strip for the $L$-function, and it is here that the behaviour of the function is most interesting.

[Aside: we can, for example, show that $\Lambda_E(1+s)$ is never zero outside of the critical strip. The Generalized Riemann Hypothesis in fact asserts that elliptic curve $L$-functions are only ever zero along the exact center of this strip, the critical line $\Re(s)=1$. We'll get back to the zeros of elliptic curve $L$-functions in a later post.]

To evaluate $\Lambda_E(s)$ in the critical strip, we make use of the modularity of $E$. The modularity theorem states that elliptic curves are modular: for every elliptic curve $E$ over the rationals there exists a weight 2 cuspidal eigenform $f_E$ of level $N$ (where $N$ is precisely the conductor of $E$), such that the $a_n$ of $E$ as defined previously equal the Fourier coefficients of $f_E$. If you haven't studied modular forms before, the crucial piece of knowledge is that there is a natural procedure for constructing $L$-functions from modular forms in such a way that the definition makes sense for all complex inputs $s$, and that the $L$-function attached to the cusp form $f_E$ will exactly equal the elliptic curve $L$-function $L_E(s)$. This is in fact how we show that elliptic curve $L$-functions can be analytically continued to all of $\mathbb{C}$.

The take-away is that via modularity there is an infinite sum representation for $L_E(s)$ which converges absolutely for all $s$. Here it is: define the auxiliary function
$$\lambda_E(s) = \left(\frac{\sqrt{N}}{2\pi}\right)^{s} \sum_{n=1}^\infty a_n n^{-s}\Gamma \left(s,\frac{2\pi}{\sqrt{N}}\cdot n\right), $$
where all the quantities are as defined previously, and $\Gamma(s,x)$ is the upper incomplete Gamma function (note that since $\Gamma(s,x)$ goes to zero exponentially as $x$ goes to infinity, this sum will converge absolutely for any $s$, with rate of convergence scaling with $\sqrt{N}$). Then we have
$$ \Lambda_E(s) = \lambda_E(s) + w_E \lambda_E(2-s). $$
Because we know how to compute incomplete Gamma functions, this gives us a way to evaluate $\Lambda_E(s)$, and thus $L_E(s)$, in the critical strip. The upside of this formula and its variations is that it works for any value of $s$ you stick in - including values near $s=1$. Similar formulae exist for the derivatives of $L_E(s)$, so we can in theory compute $L_E^{(n)}(1)$, the $n$th derivative of $L_E(s)$ at the central point, to any degree of accuracy for any $n$.
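To make this concrete, here is a minimal Python sketch of the evaluation described above. It is illustrative only: it assumes real $s$ and a precomputed list of Dirichlet coefficients $a_n$, and it uses a simple Simpson-rule quadrature for the incomplete Gamma function rather than Sage's built-in routines; all function names are my own.

```python
import math

def upper_gamma(s, x, steps=2000):
    """Upper incomplete Gamma function Gamma(s, x) = int_x^inf t^(s-1) e^(-t) dt,
    by composite Simpson quadrature. The integrand decays like e^(-t), so
    truncating the integral at x + 40 is harmless at double precision."""
    a, b = x, x + 40.0
    f = lambda t: t ** (s - 1) * math.exp(-t)
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def lambda_E(s, N, a_coeffs):
    """Truncation of lambda_E(s) = (sqrt(N)/2pi)^s * sum_n a_n n^(-s) Gamma(s, 2pi*n/sqrt(N)),
    summed over as many coefficients a_1, a_2, ... as are supplied."""
    A = math.sqrt(N) / (2 * math.pi)
    return A ** s * sum(a_n * n ** (-s) * upper_gamma(s, n / A)
                        for n, a_n in enumerate(a_coeffs, start=1))

def Lambda_E(s, N, a_coeffs, w):
    """Completed L-function via Lambda_E(s) = lambda_E(s) + w_E * lambda_E(2 - s)."""
    return lambda_E(s, N, a_coeffs) + w * lambda_E(2 - s, N, a_coeffs)
```

Note that by construction $\Lambda_E(1+t) = \Lambda_E(1-t)$ whenever $w_E = 1$, which gives a cheap consistency check on any implementation.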

Thus if we want to compute a curve's analytic rank, what's stopping us from just evaluating successive derivatives of $L_E(s)$ at $s=1$ until we hit one which is not zero?

Two reasons. The first is that there's no way around the fact that you need about $\sqrt{N}$ terms to compute $L_E(s)$ or its derivatives to decent precision. If the conductor of the curve is too big, as is often the case, it takes an inordinate amount of time to simply evaluate the $L$-function near the central point. This makes direct evaluation impractical for all but the smallest-conductor curves -- and for those curves we can usually compute rank via other methods anyway.

The second reason is a more subtle one: how do you tell numerically whether the $n$th derivative of $L_E(s)$ at $s=1$ is zero? If you think about it, it's easy to answer this question in one direction only: if you evaluate $L_E^{(n)}(1)$ to some precision and get digits that aren't all zero, then (assuming your code is correct) the $n$th derivative of $L_E(s)$ does not vanish. However, no matter how many digits of precision you compute, if they all come out zero the possibility always remains that the next digit along might not be zero.

In general, there is no numerical way to determine whether the value of a black-box complex-valued function at a point is zero, or just really, really close to zero. This is why, if you look in the literature, you'll find "methods to estimate analytic rank", but never to compute the quantity exactly. It's impossible to do without extra knowledge. Specifically, we'd need some sort of computable lower bound on the size of the derivatives of $L_E(s)$ at the central point. Unfortunately, no such theorems currently exist, so we're stuck with estimating analytic rank for now.

Thankfully, the $\sqrt{N}$-dependence issue is less hopeless than the zero-detection one. The next post will detail a method that provides good estimates for the analytic rank while scaling much more slowly with the curve's conductor.

by Simon Spicer (noreply@blogger.com) at June 11, 2014 04:57 PM

Mathematical background: elliptic curves and L-functions

Day 1 of my Google Summer of Code project! I will try to post updates at least once a week; I thought a good starting point would be to give an introduction to the mathematical objects that this project revolves around. The next post will then give a more detailed description of the project itself, and the structure of the code that I'm going to adapt, write and include in Sage.

To that end, for this post I will assume some knowledge of elementary number theory, algebra and complex analysis, but nothing more complicated than that.

Let $E$ be an elliptic curve over the rational numbers. We can think of $E$ as the set of rational solutions $(x,y)$ to a two-variable cubic equation in the form:
$$ E: y^2 = x^3 + Ax + B $$
for some integers $A$ and $B$, along with an extra "point at infinity". An important criterion is that $E$ be a smooth curve; this translates to the requirement that the discriminant of the curve, given by $-16(4A^3+27B^2)$, is not zero.

One of the natural questions to ask when considering an elliptic curve is "how many rational solutions are there?" It turns out elliptic curves fall in that sweet spot where the answer could be zero, finitely many or infinitely many - and figuring out which is the case is a deeply non-trivial problem.

The rational solutions form an abelian group with a well-defined group operation that can be easily computed. By a theorem of Mordell, the group of rational points on an elliptic curve $E(\mathbb{Q})$ is finitely generated; we can therefore write
$$ E(\mathbb{Q}) \approx T \times \mathbb{Z}^r,$$
where $T$ is a finite group (called the torsion subgroup of $E$), and $r$ is denoted the algebraic rank of $E$.

The elliptic curve group law explained visually: three points on a straight line add to zero; because the point at infinity is the identity element, this means that the sum R of two points P and Q is the reflection about the x-axis of the third point on the curve lying on the straight line connecting P and Q.

Determining the torsion subgroup of $E$ is a relatively straightforward endeavor. By a theorem of Mazur, rational elliptic curves have torsion subgroups that are (non-canonically) isomorphic to one of precisely fifteen possibilities: $\mathbb{Z}/n\mathbb{Z}$ for $n = 1$ through $10$ or $12$; or $\mathbb{Z}/2\mathbb{Z}\oplus \mathbb{Z}/2n\mathbb{Z}$ for $n = 1$ through $4$. Computing the rank $r$ - the number of independent rational points on $E$ - is the hard part, and it is towards this end that this project hopes to contribute.

Perhaps surprisingly, we can translate the algebraic problem of finding rational solutions to $E$ to an analytic one - at least conjecturally. To understand this we'll need to know what an elliptic curve $L$-function is. These are holomorphic functions defined on the whole complex plane that somehow encode a great deal of information about the elliptic curve they're attached to.

The definition goes as follows: for each prime $p$, count the number of solutions to the elliptic curve equation modulo $p$; we'll call this number $N_p(E)$. Then define the number $a_p(E)$ by
$$ a_p(E) = p+1 - N_p(E). $$
Hasse's Theorem states that $a_p(E)$ is always less than $2\sqrt{p}$ in magnitude for any $p$, and the Sato-Tate conjecture (recently proven by Taylor et al) states that for a fixed elliptic curve, the $a_p$ (suitably transformed) are asymptotically distributed in a semi-circular distribution about zero.
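Computing $N_p(E)$, and hence $a_p(E)$, for small primes is easy to do directly. The naive Python sketch below (my own illustrative code, $O(p)$ per prime, so only suitable for small $p$; Sage has far faster methods built in) counts points using Euler's criterion for quadratic residues:

```python
def a_p(A, B, p):
    """a_p = p + 1 - N_p for E: y^2 = x^3 + A*x + B over F_p,
    where p is an odd prime of good reduction."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + A * x + B) % p
        if rhs == 0:
            count += 1  # y = 0 is the only square root
        elif pow(rhs, (p - 1) // 2, p) == 1:
            count += 2  # rhs is a nonzero square mod p: two roots y and -y
    return p + 1 - count
```

For example, for the curve $y^2 = x^3 - x$ this gives $a_5 = -2$, comfortably within Hasse's bound $2\sqrt{5} \approx 4.47$.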

Next, for a given $p$ define the local factor $L_p(s)$ to be the function of the complex variable $s$ as follows:
$$ L_p(s) = \left(1-a_p(E)p^{-s} + \epsilon(p)p^{-2s}\right)^{-1}, $$
where $\epsilon(p)$ is 0 if $p$ divides the conductor of $E$, and 1 otherwise.

The conductor of $E$ is a positive integer that encodes the bad reduction of $E$: for a finite list of primes $p$, when we reduce the curve modulo $p$, we don't get a smooth curve but rather a singular one instead. The conductor is just the product of these primes to the first or second power (or in the cases $p=2$ or $3$, up to the eighth and fifth powers respectively). If you're unfamiliar with elliptic curves, the thing to note is that the conductor always divides the discriminant of $E$, namely the quantity $-16(4A^3+27B^2)$ mentioned previously.

Finally we can define the $L$-function attached to $E$:
$$ L_E(s) = \prod_p L_p(s) = \prod_p \left(1-a_p(E)p^{-s} + \epsilon(p)p^{-2s}\right)^{-1}. $$
The above representation of $L_E(s)$ is called the Euler product form of the $L$-function. If we multiply out the terms and use power series inversion we can also write $L_E(s)$ as a Dirichlet series:
$$ L_E(s) = \sum_{n=1}^{\infty} a_n(E) n^{-s}, $$
where for non-prime $n$ the coefficients $a_n$ are defined to be exactly the integers you get when you multiply out the Euler expansion.
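Concretely, expanding a single local factor as a geometric series gives the recursion $a_{p^k} = a_p\,a_{p^{k-1}} - \epsilon(p)\,p\,a_{p^{k-2}}$, and multiplicativity ($a_{mn} = a_m a_n$ for coprime $m$ and $n$) then determines every $a_n$. Here is a small Python sketch of that bookkeeping (the function names are my own, for illustration):

```python
def a_prime_power(p, k, ap, eps):
    """a_{p^k} via a_{p^k} = a_p * a_{p^(k-1)} - eps(p) * p * a_{p^(k-2)},
    read off from the geometric-series expansion of one local Euler factor.
    ap: dict p -> a_p ; eps: dict p -> 0 or 1 (bad/good reduction flag)."""
    prev2, prev1 = 1, ap[p]  # a_{p^0} = 1, a_{p^1} = a_p
    for _ in range(2, k + 1):
        prev2, prev1 = prev1, ap[p] * prev1 - eps[p] * p * prev2
    return prev1

def a_n_from_euler(n, ap, eps):
    """Multiplicative extension: a_{mn} = a_m * a_n whenever gcd(m, n) = 1,
    computed by trial-dividing n into prime powers."""
    result, m, p = 1, n, 2
    while p * p <= m:
        if m % p == 0:
            k = 0
            while m % p == 0:
                m //= p
                k += 1
            result *= a_prime_power(p, k, ap, eps)
        p += 1
    if m > 1:
        result *= a_prime_power(m, 1, ap, eps)
    return result
```

For instance, feeding in the first few $a_p$ of the conductor-11 curve ($a_2=-2$, $a_3=-1$) correctly reproduces $a_4 = a_2^2 - 2 = 2$ and $a_6 = a_2 a_3 = 2$.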

Using Hasse's bound on the size of the $a_p(E)$ and their distribution according to Sato-Tate, one can show that the above two representations only converge absolutely when the real part of $s$ is greater than $\frac{3}{2}$, and conditionally for $\Re(s)>\frac{1}{2}$. However, the modularity theorem states that these elliptic curve $L$-functions can actually be analytically continued to the entire complex plane. That is, for every elliptic curve $L$-function $L_E(s)$ as defined above, there is an entire function on $\mathbb{C}$ which agrees with the Euler product/Dirichlet series definition for $\Re(s)>\frac{3}{2}$, but is also defined - and readily computable - for all other complex values of $s$. This entire function is what we actually call the $L$-function attached to $E$.

The way we analytically continue $L_E(s)$ yields that the function is highly symmetric about the line $\Re(s)=1$; moreover, because the function is defined by real coefficients $L_E(s)$ also obeys a reflection symmetry along the real axis. The point $s=1$ is in a very real sense therefore the central value for the $L$-function. It thus makes sense to investigate the behaviour of the function around this point.

Because $L_E(s)$ is entire, it has a Taylor expansion at any given point. We can ask what the Taylor expansion of $L_E(s)$ about the point $s=1$ is, for example. One of the central unproven conjectures in modern-day number theory is the Birch and Swinnerton-Dyer Conjecture: that the order of vanishing of the Taylor expansion of $L_E(s)$ about the point $s=1$ is precisely $r$, the algebraic rank of the elliptic curve $E$. That is, if we let $z=s-1$ so that
$$ L_E(s) = c_0 + c_1 z + c_2 z^2 + \ldots $$
is the expansion of $L_E(s)$ about the central point, the BSD conjecture holds that $c_0$ through $c_{r-1}$ are all zero, and $c_r$ is not zero.

The values of three elliptic curve L-functions along the critical line $1+it$ for $-10 \le t \le 10$. Blue corresponds to the curve $y^2 = x^3 - 13392x - 1080432$, a rank 0 curve, red is that of $y^2 = x^3 - 16x + 16$, a rank 1 curve, and green is $y^2 = x^3 - 3024x + 46224$, a rank 2 curve. Note that close to the origin the graphs look like a non-zero constant function, a straight line through the origin and a parabola respectively.

Thus if we can compute the order of vanishing of the elliptic curve $L$-function at the central point, we can at least conjecturally compute the rank of the curve. This converts an algebraic problem into a numerical one, which is perhaps more tractable.

The techniques we'll use to attempt to do this will be the subject of the next blog post. Unfortunately there are still plenty of challenges to this approach - the least of which boils down to: how do you numerically determine if the value of a function is zero, or just really, really close to zero? The short answer is that without theorems to back you up, you can't -- but we can still make considerable progress toward the problem of computing an elliptic curve's rank.

by Simon Spicer (noreply@blogger.com) at June 11, 2014 02:48 PM

June 10, 2014

Simon Spicer

How to develop for Sage using Sage Math Cloud
 and Git

I will be using Sage Math Cloud to write all the code for my Google Summer of Code project. The advantage of choosing the cloud-based route should be clear: you don’t need to install or store anything locally, and it frees you up to use any device with a web browser to write code. The downside, as Sage Math Cloud is still new, is that tutorials and how-tos on the setup process are a bit lacking. Since I couldn’t find any single unified source on how to do it, I thought I might create a step-by-step guide on getting set up in Sage Math Cloud to write code for inclusion in the Sage codebase.

Much of what I've done follows the Git the Hard Way tutorial for Sage development. However, there are significant departures. I recommend reading through that tutorial first; if you do then what follows below will be a piece of cake.

First, you’ll need an account at cloud.sagemath.com. Good news: the whole service is completely free to use, and there’s no annoying back-and-forthing via emailed confirmation links. It’ll take you two minutes tops.

The cloud.sagemath.com homepage.

Your work on your Sage Math Cloud account is project-based; each account can have a large number of projects associated to it. Each project acts like its own Linux-based file system inside of which you can create and edit any number of files of any type. Create a project in which you'll house your code.

The project root view, before any files have been created.

Next, you'll want a local installation of Sage. We're going to clone the Sage master repo, as hosted on GitHub. Click New --> Terminal to open up a new terminal window - we'll be doing everything via terminal commands. Then type the following three commands:

~$ git clone git://github.com/sagemath/sage.git

~$ cd sage

~/sage$ make

The first will download a copy of the Sage codebase into the new folder 'sage' (it should appear in the project root directory after the download is complete); the second changes into that directory, and the third initiates the compilation of the Sage source.
Note that the building of Sage usually takes about 3 hours, so get this going and then kick back and take inspiration from this.

While your code is compiling you'll want to make sure you have a GitHub account. If you don't, follow the steps there to create one. While not required, hosting any projects on GitHub is highly recommended: it will allow multiple people to work on a project simultaneously, as well as giving you access to sophisticated revision control functionality that will prevent you from accidentally deleting code or data.

GitHub repo creation page.

Create an empty repository - we'll fill it with things later. So long as the repo exists and you know its name you're good to go.

The next thing we'll want to do is register the Sage Trac repository, where all the active tickets are hosted for Sage development. Once you're done creating an empty repo on your GitHub account, go back to your Sage Math Cloud project and wait for Sage to finish building. When this is done, in the terminal window type the following:

~/sage$ git remote add trac git@trac.sagemath.org:sage.git -t master

~/sage$ git remote -v

origin  git://github.com/sagemath/sage.git (fetch)

origin  git://github.com/sagemath/sage.git (push)

trac    git@trac.sagemath.org:sage.git (fetch)

trac    git@trac.sagemath.org:sage.git (push)

The git remote add command registers the Sage trac repo as a remote repository (trac is the name it is registered as locally; you can of course call it anything you want). The git remote -v command simply lists what repos you have now registered. We won't be using Sage trac for now. If you want more practice with trac, see the Git the Hard Way tutorial mentioned previously.

Note the inclusion of the word master in the command above; this means we will only track the master branch of the remote repo. A branch is an instanced copy of the codebase which you can work on and edit without altering other branches. The master branch should be kept pristine, so we will want to create and switch to a new branch. This is accomplished with the git checkout command:

~/sage$ git checkout -b demo

Switched to a new branch 'demo'

~/sage$ git branch

* demo
  master

The -b parameter creates the branch, while checkout switches to that branch. Typing git branch then shows you what branches currently exist locally, as well as the one you're currently on.

We're now in the position to register the demo branch with our fresh repo on GitHub:

~/sage$ git remote add demo https://github.com/haikona/demo.git

~/sage$ git remote -v

demo    https://github.com/haikona/demo.git (fetch)

demo    https://github.com/haikona/demo.git (push)

origin  git://github.com/sagemath/sage.git (fetch)

origin  git://github.com/sagemath/sage.git (push)

trac    git@trac.sagemath.org:sage.git (fetch)

trac    git@trac.sagemath.org:sage.git (push)

And finally, we can push to our repo to sync it with the local copy. For this you will need your GitHub password - so not just any old Harry can push to your repo.

~/sage$ git push demo demo
Username for 'https://github.com': haikona
Password for 'https://haikona@github.com':
Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 317 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
To https://github.com/haikona/demo.git
   2b1d88b..21cddb8  demo -> demo

You're now synced and ready to start writing code! From here on you should be able to follow one of the many tutorials on working with git; the hard part - setting up - is all done.

by Simon Spicer (noreply@blogger.com) at June 10, 2014 03:21 PM

June 05, 2014

William Stein

The Official Go Tutorial as a (41-page) SageMathCloud worksheet

Do you like using interactive SageMathCloud worksheets and want to learn the basics of the Go Programming language? I've added a %go magic to SMC worksheets, and translated the official Go tutorial into a single long worksheet.
  1. Open a SageMathCloud project and restart the project server if necessary (project settings --> Restart).
  2. Click +New and paste in this URL (then press enter): https://github.com/sagemath/cloud-examples.git
  3. You'll get a large collection of example worksheets in a directory cloud-examples. Navigate to the "Go" subdirectory and open go.sagews.
You can also directly browse a PDF of the worksheet here: https://github.com/sagemath/cloud-examples/blob/master/go/go.pdf?raw=true

by William Stein (noreply@blogger.com) at June 05, 2014 06:42 PM

June 02, 2014

William Stein

Update to SageMathCloud - Codemirror 4.2, etc.

I've made the following updates to https://cloud.sagemath.com:

User interface changes (refresh your browser to get)

Sage Worksheets

  • Don't show spinner unless computation is taking more than a second.
  • Streamlined evaluation a little bit (never get stuck with the grey spinner when you're online)
  • 2d graphics now display using svg by default, since browser bugs have been fixed

Upgrade from CodeMirror 3.20 to CodeMirror 4.2, which is much better, faster, and has new features:

  • Multiple cursors (use control or command+click to create multiple cursors)
  • Sublime keyboard bindings
  • New editor features (you can turn these off in account settings):
    • Auto brackets: automatically close brackets, quotes, etc.
    • Show trailing whitespace: show spaces at ends of lines
Here's a 2-minute screencast that illustrates some of the above UI features: http://youtu.be/ykb12MGHOuk

Task Lists: There is now a preliminary task list implementation. To use it, make a file that ends in .tasks.

  • Task editing now uses full screen width
  • Fixed task deletion bug
  • Markdown list autocomplete

Backend changes (restart project server to get):

  • Automatically delete project log entries when the log grows beyond 7500 lines. In some cases of very active projects, the log would get enormous (say 30MB+) and just loading it would slow down everything for a while.
  • Clear a certain logfile that was getting huge whenever the project server is restarted.

by William Stein (noreply@blogger.com) at June 02, 2014 11:07 AM

May 31, 2014

Nikhil Peter

Sage Android Week 2 - Making JSON parsing less painful, the Gson way

The second week of GSoC involved a lot of staring at JSON replies, discarded code and unnecessary classes, but I am reasonably pleased with the final product.

The main agenda during the week was to redo the JSON parsing code (which currently uses the default Android implementation) with a serializer/deserializer, in this case Gson. Hence, the main requirements were:

  • Write the model Java classes corresponding to the replies from the Sage server.
  • Add appropriate getters, setters and utility functions to said classes.
  • Basic testing.

Why use Gson?

Consider a JSON reply, in this case the status reply from the server.

{
  "content": {
    "execution_state": "busy"
  },
  "msg_id": "42c63936-be03-47a3-9892-b48af8246a33",
  "parent_header": {
    "msg_id": "92d67417-ded1-4460-a852-9903d392c50d",
    "session": "ae7cc4cb-8139-455a-b041-b4ffa44a8882",
    "username": "",
    "msg_type": "execute_request"
  },
  "header": {
    "msg_id": "42c63936-be03-47a3-9892-b48af8246a33",
    "username": "kernel",
    "date": "2014-05-31T08:05:17.951510",
    "session": "b86fbd1b-2a1c-456b-b745-504ad2f79f2f",
    "msg_type": "status"
  },
  "metadata": {},
  "msg_type": "status"
}

Now normal Java code for obtaining, for example, the msg_id in the “header” field would look like:


try {
    JSONObject response = new JSONObject(str); // str contains the response from the server in string format
    String msg_id = response.getJSONObject("header").getString("msg_id");
} catch (JSONException e) {
    // obligatory, and usually useless, exception handling
}

Which works passably well in most cases, but there are a few caveats.

  1. We have to catch what is, in most cases, a useless exception, since all JSON operations throw JSONException.
  2. If the nesting is deep, we have to chain a lot of method calls.
  3. We are using raw strings (“foo”) everywhere, with a high risk of human errors like spelling mistakes.
  4. This is too much work to obtain one value, even more so considering that this is one of the simplest replies from the server.

Now, using GSON, we could do the following:


class StatusReply {

    private String msg_id;
    private String msg_type;
    private Header header;
    private Header parent_header;
    private MetaData metadata;
    private StatusContent content;

    // +Getters and Setters

    public static class Header {
        private String msg_id;
        private String session;
        private String username;
        private String msg_type;

        // +Getters and Setters
    }

    public static class StatusContent {
        private String execution_state;

        // +Getters and Setters
    }

    public static class MetaData {
        // Empty for now
    }
}
And then we simply call:

StatusReply reply = new Gson().fromJson(str, StatusReply.class);
String msg_id = reply.getHeader().getMessageID();

And that’s basically it.

That one line converts the entire JSON string into a POJO (Plain Old Java Object). As we can see it is far more concise, useful and pleasing to the eye. No more raw Strings floating around, and no more unnecessary exception handling.

Now the Sage server requests and replies are fairly consistent in some fields but wildly different in others. This presented a design challenge: I had to find common elements among the replies to group into a base class, and extend the individual replies from this base class. Due to the inherent differences between the various replies, I feel I was only partially successful in doing so, but with some tweaking I managed to bring down the number of model classes from an initial 30ish (!) to a much more manageable 13.

The only major trouble I had was with the reply to an @interact input:

{
  "content": {
    "data": {
      "application/sage-interact": {
        "new_interact_id": "17bc0888-c7ed-4aad-877d-b230aca49943",
        "controls": {
          "n": {
            "update": true,
            "raw": true,
            "control_type": "slider",
            "display_value": true,
            "values": [...],
            "default": 0,
            "range": [...],
            "subtype": "discrete",
            "label": "n",
            "step": 1,
            "readonly": false,
            "locations": null,
            "layout": [...]
          }
        }
      },
      "text/plain": "Sage Interact"
    },
    "source": "sagecell"
  },
  "msg_id": "ad34a0c6-8109-4314-85e7-698ba6704bbf",
  "parent_header": {
    "msg_id": "92d67417-ded1-4460-a852-9903d392c50d",
    "session": "ae7cc4cb-8139-455a-b041-b4ffa44a8882",
    "username": "",
    "msg_type": "execute_request"
  },
  "header": {
    "msg_id": "ad34a0c6-8109-4314-85e7-698ba6704bbf",
    "username": "kernel",
    "date": "2014-05-31T08:05:17.955924",
    "session": "b86fbd1b-2a1c-456b-b745-504ad2f79f2f",
    "msg_type": "display_data"
  },
  "metadata": {},
  "msg_type": "display_data"
}

We can see that the msg_type, msg_id, header, parent_header and content fields all remain unchanged, but the remaining ones are vastly different. However, deserializing these with Gson is still relatively simple, except for one part, viz. the “n” object. This key is dynamic, since it depends on the variables the user has used in their input, and so it’s not really possible to make a class for it. However, Gson provides a solution to this problem in the form of a custom deserializer. A few (~150) lines of code later, and this was done.
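The core idea behind the custom deserializer is to map the dynamically-keyed "controls" object into a dictionary keyed by the user's variable names, rather than into fixed fields. Here is a Python sketch of the same idea (illustrative only: the actual implementation is the ~150 lines of Java Gson code mentioned above, and the class and function names below are hypothetical):

```python
import json

class Control:
    """Illustrative stand-in for a model class: holds one control's spec
    under whatever variable name the user chose (e.g. "n")."""
    def __init__(self, name, spec):
        self.name = name
        self.control_type = spec.get("control_type")
        self.spec = spec  # keep the raw spec for anything else we need

def parse_controls(reply_str):
    """The dynamic-key trick: instead of a fixed field per variable,
    turn the "controls" object into a dict keyed by variable name."""
    reply = json.loads(reply_str)
    controls = reply["content"]["data"]["application/sage-interact"]["controls"]
    return {name: Control(name, spec) for name, spec in controls.items()}
```

This way the parser never needs to know the variable names in advance; whatever keys appear under "controls" simply become entries in the returned dictionary.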

What seemed easy in writing was not so in actual implementation. There are a number of ways I could have gone about this (and I tried most of them), and this was the only way which seemed natural and did not rely on hacks to obtain the reply. This took the better part of 4 days, and I finalized the classes and methods in the last two days.

I’ve never needed to write a custom deserializer before, so that was an exciting and informative challenge.


Why did I just add a bunch of classes which currently serve little purpose other than acting as a template for the JSON replies?

  • We’ve now separated the design from the main code logic. If the Sage server responses ever change in the future, all we have to do is change the appropriate fields in the models, with little to no change in the main code.
  • It is much faster (to write and to process) to handle JSON with a serializer/deserializer. Gson isn’t the fastest (it is, in fact, one of the slowest); that distinction belongs to Jackson, which is a whopping 3 orders of magnitude faster, depending on the type of data to be parsed. Here I chose Gson because:
    • I’m more familiar with it
    • It is arguably more popular and widespread than Jackson, and is used in a wider variety of Open Source libraries and projects.
    • It’s maintained by Google (and is also open source). Nuff said.
  • It removes unnecessary processing and functions from the main logic, which otherwise serve no purpose other than parsing the responses.

Future Agenda

Next week’s goals include:

  • Replace the default HttpClient (Android has either the Apache-based client or the Java one, both of which I personally find too cumbersome) with something better and easier to use. This “something” I STILL haven’t got around to finalizing, but there are a few options:
    1. Volley: Debuted at Google I/O last year; highly memory efficient (it’s designed by Google) but lacking in everything else: documentation, proper support, etc.
    2. Retrofit: One of the most popular libraries for HTTP requests, the only problems being that I haven’t personally used it before and that it’s annotation based, which I’m not 100% sure about.
    3. Other misc (mostly) asynchronous libraries: things which come to mind are async-http, Ion, RoboSpice, etc.
  • Replace the current websocket library. This is something I HAVE decided on: it’s going to be AndroidAsync, which is pretty popular and well maintained (the author is one of the lead developers at CyanogenMod).
  • Use this and Gson together and remove all the older code.

I hope I can get this done in a week, but since I am reasonably ahead at the moment w.r.t. my proposal, I do have some extra time available.

I’ll have a chat with Volker about the library and hopefully together we can come to a decision on what to use.

That’s all from me, thanks for reading!

by hav3n at May 31, 2014 01:51 PM

May 21, 2014

Nikhil Peter

GSoC 2014 – The First Week

It’s the first week of GSoC, and so I decided that maybe I should stop procrastinating and update this blog.

Here we go then!

Agendas during this week include:

  • Get familiar with the remaining code (mostly done)
  • Refactor the code into packages so it’s more readable and developer friendly.
  • Gradle Migration:
    • The problem with migrating to gradle is that the entire project structure would have to be changed, which would make it incompatible with Eclipse. For the time being, I’ve decided to keep it Gradle and Eclipse compatible, which will involve some duplication of effort, but nothing too major.
  • Error Reporting: ACRA
    • The choice of error reporting on the client side usually comes down to two options: Google’s own Analytics, which is more metric-centric than crash-report oriented, or ACRA, which is much simpler to use and implement.
    • Having used ACRA before, it was my natural choice… however there are some limitations, especially for an Open Source project such as this:
      1. The error reports must be sent somewhere. If the project were closed source, the natural choice would be to send them to the dev’s email address or to some script accessible to the developer.
      2. Ideally we would need something like OctoCrash, which could directly post the crash as a GitHub issue. Unfortunately, this is only for iOS, so it’s not an option for us.
  • Currently work is mostly on the CellDialog which adds a new cell. Features to be added here include:
    • More intuitive group selection
    • Add an option to delete cells from the CellListFragment directly (also implementing multiple deletions), without having to go into the SageActivity each time to do so.
    • Currently, the only option is to edit the cells (apart from the options in the overflow menu), so it would be better to have the delete and edit options in a CAB (Contextual Action Bar), to better conform to Android Guidelines.

by hav3n at May 21, 2014 08:05 AM

May 15, 2014

David Horgan (quantumtetrahedron)


This week I have continued working on the spectrum of the Ricci operator for a monochromatic 4-valent node dual to an equilateral tetrahedron. This blog post reports on work in progress.

In my last post I indicated how far I had got in porting the code for the monochromatic 4-valent node from Mathematica to sagemath. This is essentially complete now and I have just got to output a graph of eigenvalues of curvature versus spin.

Sagemath code for the spectrum of the Ricci Operator for a monochromatic 4-valent node dual to an equilateral tetrahedron

The curvature is written as a combination of the length of a hinge and the deficit angle around it.



As yet the code is unoptimised. Below is a sample of the output so far:



So now the eigenvalues have been found I can start to make use of them in calculations of the Hamiltonian Constraint Operator.



Eigenvalue vs spin for the 4-valent node


by David Horgan at May 15, 2014 09:20 PM

May 11, 2014

Vince Knight

Wizards, Giants, Linear Programming and Sage

I've been playing a game on my phone for a couple of months now called Clash of Clans. It's one of those freemium games that tells me to do something on my phone every now and then:

In essence during the game you build your base with buildings to obtain resources, create troops and defend against attacks.

Here's a screen shot of my base at the moment:

As well as building your base, you can also put together a raiding party and attack other bases.
In this post, I'll describe the use of a mathematical technique called linear programming to find the best combination of Giants, Wizards etc. in a raiding party.

To train a warrior costs Elixir (a resource that you mine and/or plunder) but also costs space in your army camps.

Here are the various warriors available to me at the moment (as you build better buildings you get more warriors etc...).

1. Barbarian (Damage per second: 18, Hitpoints: 78, Housing Space: 1, Elixir Cost: 80):

2. Archer (Damage per second: 16, Hitpoints: 33, Housing Space: 1, Elixir Cost: 160)

3. Goblin (Damage per second: 19, Hitpoints: 36, Housing Space: 1, Elixir Cost: 60)

4. Giant (Damage per second: 24, Hitpoints: 520, Housing Space: 5, Elixir Cost: 2000)

5. Wall Breaker (Damage per second: 32, Hitpoints: 35, Housing Space: 2, Elixir Cost: 2500)

6. Balloon (Damage per second: 72, Hitpoints: 280, Housing Space: 5, Elixir Cost: 3500)

7. Wizards (Damage per second: 125, Hitpoints: 130, Housing Space: 4, Elixir Cost: 3000)

8. Healer (Damage per second: NA, Hitpoints: 600, Housing Space: 14, Elixir Cost: 6000)

9. Dragon (Damage per second: 140, Hitpoints: 1900, Housing Space: 20, Elixir Cost: 25000)

To summarise the information about the warriors, I'm going to put the costs in a vector \(c\) and the housing spaces in a vector \(h\) (indexed using the same order as the images above):


Currently I have a capacity of 50 housing spaces per camp and I have 4 camps so I have an upper bound on the total amount of housing space of \(K=200\).

If I let the vector \(x\) denote a raiding party so that \(x_i\) is the number of warriors of type \(i\) (using the ordering above) then we have the corresponding housing space constraint:

\[\sum_{i=1}^9h_ix_i\leq K\]

I can now try and do two things:
  1. Get a full set of camps by minimising the overall cost.
  2. Get the most powerful raiding party that can fit in my camps.
I will solve both of these problems using what is called Integer Linear Programming. The integer part of that is due to the fact that I must have whole numbers of warriors.
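
Before reaching for a solver, here is a brute-force toy version of the idea (my illustration, not the post's method) showing why the integrality requirement matters: we minimise cost subject to exactly filling a tiny camp of size 10, using just Barbarians (cost 80, space 1) and Giants (cost 2000, space 5), and only whole warriors are allowed.

```python
# Toy integer program solved by exhaustive search (illustration only).
cost = {'barbarian': 80, 'giant': 2000}
space = {'barbarian': 1, 'giant': 5}
K = 10  # tiny camp capacity for the toy example

best = None
for barbarians in range(K + 1):        # integer counts only --
    for giants in range(K // 5 + 1):   # no fractional warriors allowed
        if barbarians * space['barbarian'] + giants * space['giant'] == K:
            c = barbarians * cost['barbarian'] + giants * cost['giant']
            if best is None or c < best[0]:
                best = (c, barbarians, giants)

print(best)  # (800, 10, 0): ten Barbarians is the cheapest way to fill it
```

Real solvers like the one used below handle this far more cleverly than enumeration, but the feasible set is the same: integer points satisfying the constraints.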

  1. Minimising my costs

The cost of a given raiding party \(x\) is given by:

\[C(x) = \sum_{i=1}^9c_ix_i \]

Thus to minimise this cost and fill the camps we need to solve the following minimisation problem:

\[\text{minimise }C(x) = \sum_{i=1}^9c_ix_i\]

such that:

\[H(x)=\sum_{i=1}^9h_ix_i= K\]

How do we solve this? I've blogged about a similar type of optimisation problem that I use to schedule my class assessment and for that I used the awesome sagemath. You can read that blog post here. Here I will do the same thing.

The following Sage code is what is needed to create the linear program and also obtain the solution:

# Set up parameters:
K = 200 # Set the upper bound
c = [80,160,60,2000,2500,3500,3000,6000,25000] # Set the cost vector
h = [1,1,1,5,2,5,4,14,20] # Set the 'housing space' constraints vector
n = len(c) # Get the number of warriors

C = lambda x: sum([x[i]*c[i] for i in range(n)]) # Define the cost function
H = lambda x: sum([x[i]*h[i] for i in range(n)]) # Define the housing capacity function

# Create and solve the optimisation problem
p = MixedIntegerLinearProgram(maximization=False) # Create an optimisation problem with minimisation as the goal
x = p.new_variable(integer=True) # Create a variable with the requirement that it must be integer
p.add_constraint(H(x)==K) # Set the housing constraint
p.set_objective(C(x)) # Set the objective function
p.show() # Show the optimisation problem
p.solve() # Solve the optimisation problem

# Display the solution

print 'Has solution:' 
for i, v in p.get_values(x).iteritems(): 
    print 'x_%s = %s' % (i, int(round(v))) 
print 'Housing spaces used: %s' % H(p.get_values(x))

The output is (disappointing but expected): \(x=(0,0,200,0,0,0,0,0,0)\) which implies that I should have 200 Goblins.
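
As a quick sanity check on that solution we can verify the housing and cost by hand, in plain Python outside of Sage (the helper names `cost` and `housing` are mine; the data is transcribed from the post):

```python
# Cost and housing vectors from the post (same 0-indexed ordering of warriors).
c = [80, 160, 60, 2000, 2500, 3500, 3000, 6000, 25000]
h = [1, 1, 1, 5, 2, 5, 4, 14, 20]

def cost(x):
    """Total elixir cost of a raiding party x."""
    return sum(xi * ci for xi, ci in zip(x, c))

def housing(x):
    """Total housing space used by a raiding party x."""
    return sum(xi * hi for xi, hi in zip(x, h))

goblins = [0, 0, 200, 0, 0, 0, 0, 0, 0]  # the solver's answer
print(housing(goblins))  # 200 -- exactly fills the camps
print(cost(goblins))     # 12000 elixir
```

Goblins cost only 60 elixir per housing space, the lowest ratio of any warrior, which is why the solver picks nothing else.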

If you're familiar with the game this is not a very sensible way to go as Goblins are really only good at getting loot from defenceless bases.

We can add a couple more constraints so that we don't have too many goblins (let's limit it to 10) and also so that we have at least one healer:

p = MixedIntegerLinearProgram(maximization=False)  # Create an optimisation problem with minimisation as the goal
x = p.new_variable(integer=True) # Create a variable with the requirement that it must be integer
p.add_constraint(H(x)==K) # Set the housing constraint
p.add_constraint(x[2]<=10) # Don't want more than 10 goblins
p.add_constraint(x[7]>=1) # Want 1 healer at least
p.set_objective(C(x)) # Set the objective function
p.solve() # Solve the optimisation problem
print 'Has solution:'
for i, v in p.get_values(x).iteritems():
    print 'x_%s = %s' % (i, int(round(v)))
print 'Housing spaces used: %s' % H(p.get_values(x))

This gives us \(x=(176,0,10,0,0,0,0,1,0)\).

This is again not very satisfying as we're basically filling our raiding party with the cheapest options. That was the purpose of this approach though. 

The next thing we'll look at is how to actually get a good raiding party.

  2. Getting the 'best' raiding party.

As you can see on the images, every warrior is different: some fly, some are better against buildings, some have more hitpoints than others etc...

One way of getting the 'best' raiding party could be to maximise the damage per second of the raiding party. To do this, let's define the following vector (which is just the damage per second that can be read off of the images above):

\[d=(18, 16, 19, 24, 32, 72, 125, 0, 140)\]

Now our maximisation problem can be written as:

\[\text{maximise } \sum_{i=1}^9d_ix_i\]

such that:

\[H(x)=\sum_{i=1}^9h_ix_i\leq K\]

Here's the Sage code to do this:

damagepersecond = [18, 16, 19, 24, 32, 72, 125, 0, 140]
D = lambda x: sum([x[i]*damagepersecond[i] for i in range(n)]) # Define the damage function

p = MixedIntegerLinearProgram() # Create an optimisation problem with maximisation as the goal
x = p.new_variable(integer=True) # Create a variable with the requirement that it must be integer
p.add_constraint(H(x)<=K) # Set the housing constraint
p.add_constraint(x[2]<=10) # Don't want more than 10 goblins
p.add_constraint(x[7]>=1) # Want 1 healer at least
p.set_objective(D(x)) # Maximise the damage per second
p.solve() # Solve the optimisation problem
print 'Has solution:'
for i, v in p.get_values(x).iteritems():
    print 'x_%s = %s' % (i, int(round(v)))
print 'Housing spaces used: %s' % H(p.get_values(x))

The result of the above is \(x=(0,0,2,0,0,0,46,1,0)\). I.e. take 2 Goblins, 46 Wizards and 1 Healer.
The problem with this raiding party is that wizards are actually quite fragile (archer towers and cannons can pretty much just pick them off).

So perhaps instead of maximising the damage per second we could maximise the total hit points of the party.

To do this we need to define the vector of hitpoints:

\[p=(78, 33, 36, 520, 35, 280, 130, 600, 1900)\]

Now our maximisation problem can be written as:

\[\text{maximise } \sum_{i=1}^9p_ix_i\]

such that:

\[H(x)=\sum_{i=1}^9h_ix_i\leq K\]

This gives us \(x=(0,0,0,0,93,0,0,1,0)\): a raiding party of Wall Breakers.

The problem with this is that wall breakers are good for just that: breaking walls. They carry a bomb up to a wall and then they explode. So let's add some more constraints to our optimisation problem so as to get something a bit more useful.
  • Let's make sure that we always have at least 20 housing capacity worth of distance attackers (Archers or Wizards):
\[x_1h_1+x_6h_6 \geq 20 \]
  • Let's ensure that we have at least 1 air attacker (so that once the air defenses are out of the way we can finish off the base):
\[x_5+x_8\geq 1\]
  • Let's ensure that we always have some wall breakers to clear the walls for our ground units (1 per 5 giants and 1 per 20 barbarians say):
\[5x_4\geq x_3\]

\[20x_4\geq x_0\]
    The code to solve this is given here:

    hitpoints = [78, 33, 36, 520, 35, 280, 130, 600, 1900]
    P = lambda x: sum([x[i]*hitpoints[i] for i in range(n)]) # Define the hitpoints function

    p = MixedIntegerLinearProgram()  # Create an optimisation problem with maximisation as the goal
    x = p.new_variable(integer=True) # Create a variable with the requirement that it must be integer
    p.add_constraint(H(x) <= K) # Set the housing constraint
    p.add_constraint(x[2] <= 10) # Don't want more than 10 goblins
    p.add_constraint(x[7] >= 1) # Want 1 healer at least
    p.add_constraint(x[1] * h[1] + x[6] * h[6] >= 20) # Want at least 20 capacity worth of distance attackers
    p.add_constraint(x[5] + x[8] >= 1) # Want at least 1 air attacker
    p.add_constraint(5 * x[4] >= x[3]) # Want at least 1 wall breaker per 5 giants
    p.add_constraint(20 * x[4] >= x[0]) # Want at least 1 wall breaker per 20 barbarians
    p.set_objective(P(x)) # Maximise the total hitpoints
    p.solve() # Solve the optimisation problem
    print 'Has solution:'
    for i, v in p.get_values(x).iteritems():
        print 'x_%s = %s' % (i, int(round(v)))
    print 'Housing spaces used: %s' % H(p.get_values(x))

    This gives \(x=(1,20,0,23,5,0,0,1,2)\): so take 1 Barbarian, 20 Archers, 23 Giants, 5 Wall Breakers, 1 Healer and 2 Dragons.
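    As a quick check, this plain-Python sketch (my own checker, with the data and constraints transcribed from the post, outside of Sage) confirms that the stated party satisfies every constraint:

```python
# Housing vector and capacity from the post (0-indexed warrior ordering).
h = [1, 1, 1, 5, 2, 5, 4, 14, 20]
K = 200

def feasible(x):
    """Check the full constraint set from the post for a party x."""
    checks = [
        sum(xi * hi for xi, hi in zip(x, h)) <= K,  # housing capacity
        x[2] <= 10,                        # at most 10 goblins
        x[7] >= 1,                         # at least 1 healer
        x[1] * h[1] + x[6] * h[6] >= 20,   # 20+ housing worth of distance attackers
        x[5] + x[8] >= 1,                  # at least 1 air attacker
        5 * x[4] >= x[3],                  # 1 wall breaker per 5 giants
        20 * x[4] >= x[0],                 # 1 wall breaker per 20 barbarians
    ]
    return all(checks)

best_hp_party = [1, 20, 0, 23, 5, 0, 0, 1, 2]  # the solution above
print(feasible(best_hp_party))  # True
```

    The party also uses exactly 200 housing spaces, so no capacity is wasted.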

    The above would give us the most hit points but now we're missing out on the damage power of wizards.

    Perhaps we could try and balance both hitpoints and damage.

    In essence we have what's called a multi-objective optimisation problem. One approach to solving this is to simply weight both objectives.

    We're going to use \(\alpha\) to denote the importance of the hitpoints and \(1-\alpha\) the importance of the damage.

    Our optimisation function then becomes:

    \[\text{maximise }\alpha\sum_{i=1}^9p_ix_i+(1-\alpha)\sum_{i=1}^9d_ix_i\]
    Using the same constraints as before and taking \(\alpha=.5\) (so that both damage and hit points have equal weighting) we get \(x=(11,0,0,25,5,0,5,1,1)\). Thus we should be taking in to battle: 11 Barbarians, 25 Giants, 5 Wall Breakers, 5 Wizards, 1 Healer and 1 Dragon.
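    The weighted objective is easy to evaluate by hand. This plain-Python sketch (my own scoring helper, with the data transcribed from the post) scores both the pure-hitpoints party and this blended one at \(\alpha=0.5\):

```python
# Damage-per-second and hitpoints vectors from the post (0-indexed ordering).
d = [18, 16, 19, 24, 32, 72, 125, 0, 140]
p = [78, 33, 36, 520, 35, 280, 130, 600, 1900]

def score(x, alpha):
    """alpha-weighted blend of total hitpoints and total damage per second."""
    hp = sum(xi * pi for xi, pi in zip(x, p))
    dps = sum(xi * di for xi, di in zip(x, d))
    return alpha * hp + (1 - alpha) * dps

hp_only = [1, 20, 0, 23, 5, 0, 0, 1, 2]    # the pure-hitpoints solution
balanced = [11, 0, 0, 25, 5, 0, 5, 1, 1]   # the alpha = 0.5 solution
print(score(balanced, 0.5))  # 9453.0
print(score(hp_only, 0.5))   # 9301.5
```

    So under the equal weighting the blended party genuinely outscores the pure-hitpoints one, as the solver promises.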

    This feels much more balanced than anything we've had so far and not far off what I've been using as my own raiding party.

    Given that everything is nicely coded we can quickly obtain the make up of the optimal raiding party for a range of values of \(\alpha\):

    We see that as \(\alpha\) increases (so that we care more about hit points) the number of Wizards we should use slowly decreases (at one point the constraint for ranged attacking units kicks in and we replace the Wizards with Archers).

    You could easily tweak the constraints to perhaps ensure your party always has some Goblins or otherwise. I think this is a nice example where a mathematical model can be used to solve a problem but at the same time must not be taken far from the problem itself. There might be some players who only choose to attack bases with a particular type of defence so they can tweak/add constraints to the problem accordingly.

    Feel free to play around with the code in the Sage cell below:

    The mathematical techniques used in this post rely on Linear Algebra, Higher Dimensional Geometry and obviously the ability to write code :)

    (On another note I'm still not in a clan on the game so if anyone has any space in a clan let me know!)

    by Vincent Knight (noreply@blogger.com) at May 11, 2014 10:06 AM

    May 06, 2014

    David Horgan (quantumtetrahedron)


    This week, thanks to some colleagues, I have been working on the spectrum of the Ricci operator for a monochromatic 4-valent node dual to an equilateral tetrahedron. This blog post reports on work in progress.


    I have been reviewing the papers seen in the posts:

    Basically I am porting Mathematica code over to sagemath so that I can then use it in the calculation of the matrix elements of the LQG Hamiltonian Constraint operator discussed in the posts:

    So far I have written code for a number of operators, but I still have about the same number to do. After this I’ll need to join them together.

    The matrix defining the operator Q {e1, e2, e3} used in the definition of the volume operator


    H is the matrix defining the operator δij.Yi.Yj used to define the length operator, expressed in the intertwiner basis


    And B the intertwiner basis


    When complete I’ll be able to produce graphs such as the one below, which is a plot of the spectrum of R as a function of the spin. This can then be used in the numerical investigation of the LQG Hamiltonian Constraint Operator.






    by David Horgan at May 06, 2014 07:18 PM

    William Stein

    Update to Differential Synchronization in SageMathCloud

    I've just pushed out a major update to how synchronization works in https://cloud.sagemath.com.

    This change is pretty significant behind the scenes, but the only difference you should notice is that everything should be better. In particular:

    • evaluation of code in Sage worksheets should feel a little snappier and more robust,
    • various random and hard to reproduce issues with synchronized editing should be fixed, e.g. chat messages out of order, etc.
    • everything should generally be a bit faster and more scalable overall.

    Here's a short technical description of what changed. The basic architecture of SageMathCloud is that there are many web browsers connected to many hubs, which are in turn connected to your project (and to many other projects too):

    [web browser] <--websocket--> [hub] <----tcp----\
    [web browser] <--websocket--> [hub] <----tcp-----> [project]
    [web browser] <--websocket--> [hub] <----tcp----/

    Until today, the differential synchronization implementation involved having a copy of the document you're editing on:

    1. each hub pictured above,
    2. in each browser, and
    3. in the project itself.

    In particular, there were three slightly different implementations of differential synchronization running all over the place. The underlying core code is the same for all three, but the way it is used in each case is different, due to different constraints. The implementations:

    • browser: running in a web browser, which mainly has to worry about dealing with the CodeMirror editor and a flaky Internet connection.
    • hub: running in a node.js server that's also handling a lot of other stuff, including worrying about auth, permissions, proxying, logging, account creation, etc.
    • project: running in the project, which doesn't have to worry about auth or proxying or much else, but does have to worry about the filesystem.

    Because we're using Node.js, all three implementations are written in the same language (CoffeeScript), and run the same underlying core code (which I BSD licensed at https://github.com/sagemath/cloud/blob/master/diffsync.coffee). The project implementation was easiest to write, since it's very simple and straightforward, and has minimal constraints. The browser implementation is difficult mainly because the Internet comes and goes (as laptops suspend/resume), and it involves patching and diff'ing a CodeMirror editor instance; CodeMirror is tricky because it views the document as a collection of lines instead of a single long string, and we want things to work even for documents with hundreds of thousands of lines, so converting back and forth to a string is not an option! Implementing the hub part of synchronization is the hardest, for various reasons -- and debugging it is particularly hard. Moreover, computing diffs can be computationally expensive if the document is large, so doing anything involving differential sync on the hub can result in nontrivial CPU usage, hence slower handling of other user messages (node.js is single threaded). The hub part of the above was so hard to get right that it had some nasty locking code, which shouldn't be needed, and just looked like a mess.
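    The real implementation is the CoffeeScript diffsync.coffee linked above; the following is only a toy Python sketch (my own, not SMC's algorithm) of the diff-then-patch core that differential sync builds on. The real thing additionally has to merge concurrent edits with fuzzy patching, which this position-based toy cannot do.

```python
import difflib

def make_patch(old, new):
    """Turn SequenceMatcher opcodes into a self-contained edit script."""
    ops = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, old, new).get_opcodes():
        if tag == 'equal':
            ops.append(('keep', i2 - i1))
        elif tag == 'delete':
            ops.append(('skip', i2 - i1))
        elif tag == 'insert':
            ops.append(('ins', new[j1:j2]))
        else:  # 'replace' = delete old text, then insert new text
            ops.append(('skip', i2 - i1))
            ops.append(('ins', new[j1:j2]))
    return ops

def apply_patch(doc, ops):
    """Replay an edit script against a copy of the document."""
    out, pos = [], 0
    for op, arg in ops:
        if op == 'keep':
            out.append(doc[pos:pos + arg])
            pos += arg
        elif op == 'skip':
            pos += arg
        else:  # 'ins'
            out.append(arg)
    return ''.join(out)

# One sync step: ship a small patch rather than the whole document.
server_copy = "print 2+2"
browser_copy = "print 2+2  # user is typing"
patch = make_patch(server_copy, browser_copy)
print(apply_patch(server_copy, patch) == browser_copy)  # True
```

    Shipping compact edit scripts like this, instead of whole documents, is what keeps sync cheap over a slow connection.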

    A lot of issues people ran into with sync involved two browsers connected to different hubs, who then connected to the same document in a project. The two hubs' crappy synchronization would appear to work right in this part of the picture "[web browser] <------------> [hub]", but have problems with this part "[hub]<-------------->[project]", which would lead to pain later on. In many cases, the only fix was to restart the hub (to kill its sync state) or for the user to switch hubs (by clearing their browser cookies).

    Change: I completely eliminated the hub from the synchronization picture. Now the only thing the hub does related to sync is forward messages back and forth between the web browser and local hub. Implementing this was harder than one might think, because the project used to consider each client to be a single tcp connection, but now many clients can connect to a project via the same tcp connection, etc.

    With this fix, if there are any bugs left with synchronization, they should be much easier to debug. The backend scalability and robustness of sync have been my top priorities for quite a while now, so I'm happy to get this stuff cleaned up, and move onto the next part of the SMC project, which is better collaboration and course support.

    by William Stein (noreply@blogger.com) at May 06, 2014 10:38 AM

    May 01, 2014

    William Stein

    What can SageMathCloud (SMC) do?

    The core functionality of SageMathCloud:

    • Color command-line Terminal with many color schemes, which several people can interact with at once, with state that survives browser refresh.
    • Editing of documents: with syntax highlighting, auto-indent, etc., for files with the following extensions:
      c, c++, cql, cpp, cc, conf, csharp, c#, coffee, css, diff, dtd, e, ecl, f, f90, f95, h, hs, lhs, html, java, jl, js, lua, m, md, mysql, patch, gp, go, pari, php, py, pyx, pl, r, rst, rb, ru, sage, sagews, scala, sh, spyx, sql, txt, tex, toml, bib, bbl, xml, yaml.
    (It's easy for me to add more, as CodeMirror supports them.) There are many color schemes and Emacs and Vim bindings.
    • Sage Worksheets: a single-document interactive way to evaluate Sage code. This is highly extensible, in that you can define % modes by simply making a function that takes a string as input, and use %default_mode to make that mode the default. Also, graphics actually work in %r mode automatically, exactly as in normal R (no mucking with devices or png's).
    • IPython notebooks: via an IPython session that is embedded in an iframe. This is synchronized, so that multiple users can interact with a notebook simultaneously, which was a nontrivial addition on top of IPython.
    • LaTeX documents: This fully supports sagetex, bibtex, etc. and the LaTeX compile command is customizable. This also has forward and inverse search, i.e., double click on the preview to get to that point in the tex file and alt-enter in the tex file to get to that point in the LaTeX document. In addition, this editor will work fine with 150+ page documents by design. (Editing multiple document files is not properly supported yet.)
    • Snapshots: the complete state of all files in your project is snapshotted (using bup, which is built on git) every 2 minutes, when you're actively editing a file. All of these snapshots are also regularly backed up to encrypted disks offsite, just in case. I plan for these highly efficient deduplicated compressed snapshots to be saved indefinitely. Browse the snapshots by clicking "Snapshots" to the right when viewing your files or type cd ~/.snapshots/master in the terminal.
    • Replication: every project is currently stored in three physically separate data centers; if a machine or data center goes down, your project pops up on another machine within about one minute. A computer at every data center would have to fail for your project to be inaccessible. I've had zero reports of projects being unavailable since I rolled out this new system 3 weeks ago (note: there was a project that didn't work, but that was because I had set the quota to 0% cpu by accident).


    The Sage install contains the following extra packages (beyond what is standard in Sage itself). When you use Sage or IPython, this will all be available.
    basemap, biopython, bitarray, brian, cbc, chomp, clawpack, cluster_seed, coxeter3, cryptominisat, cunningham_tables, database_cremona_ellcurve, database_gap, database_jones_numfield, database_kohel, database_odlyzko_zeta, database_pari, database_symbolic_data, dot2tex, fabric, gap_packages, gnuplotpy, greenlet, guppy, h5py, httplib2, kash3, lie, lrs, lxml, mahotas, mercurial, mpld3, munkres, mysql-python, nauty, netcdf4, neuron, normaliz, nose, numexpr, nzmath, oct2py, p_group_cohomology, pandas, paramiko, patsy, phc, plotly, psutil, psycopg2, pybtex, pycryptoplus, pyface, pymongo, pyproj, pyx, pyzmq, qhull, quantlib, redis, requests, rpy2, scikit_learn, scikits-image, scimath, shapely, simpy, snappy, statsmodels, stein-watkins-ecdb, tables, theano, topcom, tornado, traits, xlrd, xlwt, zeromq


    Also, I install the following extra packages into the R that is in Sage:
    KernSmooth, Matrix, Rcpp, cairo, car, circular, cluster, codetools, e1071, fields, ggplot2, glmnet, lattice, mgcv, mvtnorm, plyr, reshape2, rpart, stringr, survival, zoo

    It's Linux

    SMC can do pretty much anything that doesn't require X11 and can be done with an Ubuntu 14.04 Linux install. I've pre-installed the following packages, and if people want others, just let me know (and they will be available to all projects henceforth):
    vim git wget iperf dpkg-dev make m4 g++ gfortran liblzo2-dev libssl-dev libreadline-dev  libsqlite3-dev libncurses5-dev git zlib1g-dev openjdk-7-jdk libbz2-dev libfuse-dev pkg-config libattr1-dev libacl1-dev par2 ntp pandoc ssh python-lxml  calibre  ipython python-pyxattr python-pylibacl software-properties-common  libevent-dev xfsprogs lsof  tk-dev  dstat emacs vim texlive texlive-* gv imagemagick octave mercurial flex bison unzip libzmq-dev uuid-dev scilab axiom yacas octave-symbolic quota quotatool dot2tex python-numpy python-scipy python-pandas python-tables libglpk-dev python-h5py zsh python3 python3-zmq python3-setuptools cython htop ccache python-virtualenv clang libgeos-dev libgeos++-dev sloccount racket libxml2-dev libxslt-dev irssi libevent-dev tmux sysstat sbcl gawk noweb libgmp3-dev ghc  ghc-doc ghc-haddock ghc-mod ghc-prof haskell-mode haskell-doc subversion cvs bzr rcs subversion-tools git-svn markdown lua5.2 lua5.2-*  encfs auctex vim-latexsuite yatex spell cmake libpango1.0-dev xorg-dev gdb valgrind doxygen haskell-platform haskell-platform-doc haskell-platform-prof  mono-devel mono-tools-devel ocaml ocaml-doc tuareg-mode ocaml-mode libgdbm-dev mlton sshfs sparkleshare fig2ps epstool libav-tools python-software-properties software-properties-common h5utils libnetcdf-dev netcdf-doc netcdf-bin tig libtool iotop asciidoc autoconf bsdtar attr  libicu-dev iceweasel xvfb tree bindfs liblz4-tool tinc  python-scikits-learn python-scikits.statsmodels python-skimage python-skimage-doc  python-skimage-lib python-sklearn  python-sklearn-doc  python-sklearn-lib python-fuse cgroup-lite cgmanager-utils cgroup-bin libpam-cgroup cgmanager cgmanager-utils cgroup-lite  cgroup-bin r-recommended libquantlib0 libquantlib0-dev quantlib-examples quantlib-python quantlib-refman-html quantlib-ruby r-cran-rquantlib  libf2c2-dev libpng++-dev libcairomm-1.0-dev r-cran-cairodevice x11-apps mesa-utils libpangox-1.0-dev
    I've also put extra effort (beyond just apt-get) to install the following:
    polymake, dropbox, aldor/"AXIOM", Macaulay2, Julia, 4ti2

    Functionality that is currently under development

    We're working hard on improving SageMathCloud right now.

    • Streamlining document sync: will make code evaluation much faster, eliminate some serious bugs when the network is bad, etc.
    • Geographic load balancing and adding data centers, so that, e.g., if you're in Europe or Asia you can use SMC with everything happening there. This will involve DNS load balancing via Amazon Route 53, and additionally moving projects to run on the DC that is nearest you on startup, rather than random. Right now all computers are in North America.
    • Mounting a folder of one project in another project, in a way that automatically fixes itself in case a machine goes down, etc. Imagine mounting the projects of all 50 students in your class, so you can easily assign and collect homework, etc.
    • Homework assignment and grading functionality with crowdsourcing of problem creation, and support for peer and manual grading.
    • BSD-licensed open source single-project version of SMC.
    • Commercial software support and instructions for how to install your own into SMC (e.g., Mathematica, Matlab, Magma, etc.)
    • ssh access into projects easily

    by William Stein (noreply@blogger.com) at May 01, 2014 02:27 PM

    April 25, 2014

    William Stein

    The SageMathCloud Roadmap

    Everything below is subject to change.

    Implementation Goals

    • (by April 27) Major upgrades -- update everything to Ubuntu 14.04 and Sage-6.2. Also upgrade all packages in SMC, including Haproxy, nginx, stunnel, etc.

    • (by May 4) Streamline doc sync: top priority right now is to clean up sync, and eliminate bugs that show up when network is bad, many users, etc.

    • (by May 30) Snapshots:
      • more efficient way to browse snapshot history (timeline view)
      • browse snapshots of a single file (or directory) only
    • (by May 30) User-owned backups
      • way to download complete history of a project, i.e., the underlying bup/git repository with snapshot history.
      • way to update offline backup, getting only the changes since last download.
      • easy way to download all current files as a zip or tarball (without the snapshots).
    • (by June 30) Public content
      • Ability to make a read-only view of the project visible publicly on the internet. Only works after the account is at least n days old.
      • By default, users will have a "report spammer" button on each page. Proven 'good users' will have button removed. Any valid reported users will be permanently banned.
      • Only users with validated .edu accounts will be allowed to publish for starters. Maybe allow gmail.com at some point.
    • (by June 30) Fix all major issues (none major) that are listed on the github page: https://github.com/sagemath/cloud/issues

    • (by July 31) Group/social features:
      • Support for mounting directories between projects
      • Group management: combine a group of users and projects into a bigger entity:
        • a University course -- see feature list below
        • a research group: a collection of projects with a theme that connects them, where everybody has access to everybody else's projects
      • A feed that shows activity on all projects that users care about, with some ranking. Better notifications about chat messages and activity

    Commercial Products

    We plan four distinct products of the SMC project: increased quotas, enhanced university course support, license and support to run a private SMC cloud, supported open source BSD-licensed single-user version of SMC (and offline mode).

    • PRODUCT: Increase the quota for a specific project (launch by Aug 2014)
      • cpu cores
      • RAM
      • timeout
      • disk space
      • number of share mounts
    • Remarks:
      • There will be an option in the UI to change each of the above parameters that some project collabs see (maybe only owners initially).
      • Within moments of making a change it goes live and billing starts.
      • When the change is turned off, billing stops. When a project is not running it is not billed. (Obviously, we need to add a stop button for projects.)
      • There is a maximum amount that the user can pre-spend (e.g., $500 initially).
      • At the end of the month, the user is given a link to a Univ of Washington website and asked to pay a certain amount, and register there under the same email as they use with SMC.
      • When they pay, SMC receives an email and credits their account for the amount they pay.
      • There will also be a limit soon on the number of projects that can be associated with an account (e.g., 10); pay a small monthly fee to raise this.
    • PRODUCT: University course support (launch by Aug 2014 in time for Fall 2014 semester)

      • Free for the instructor and TA's
      • Each student pays $20 in exchange for:
        • one standard project (they can upgrade quotas as above), which TA and instructor are automatically collabs on
        • student is added as collaborator to a big shared project
        • in student's private project they get homework assignments (assigned, collected)
      • Instructor's project has all student projects as mounted shares
      • Instructor has a student data spreadsheet with student grades, project ids (links), etc.
      • Powerful modern tool for designing homework problems that can be automatically graded, with problems shared in a common pool, with ratings, and data about their usage.
      • A peer grading system for more advanced courses.
      • Tools to make manual grading more fun.
    • PRODUCT: License to run a private SMC cloud in a research lab, company, supercomputer, etc. (launch a BETA version by July 2014, with caveats about bugs).

      • base fee (based on organization size)
      • technical support fee
      • site visit support: install, run workshop/class
    • PRODUCT: Free BSD-licensed single-user account version of SMC (launch by December 2014)

      • a different way to do LaTeX editing, manage a group of IPython notebooks, use Sage worksheets, etc.
      • be included with Sage.
      • be included in many Linux distros
      • the doc synchronization code, local_hub, CoffeeScript client, terminal.
      • mostly Node.js application (with a little Python for Sage/IPython).
      • ability to sync with a cloud-hosted SMC project.
      • sell pre-configured (or just support) for a user to install standalone-SMC on some cloud host such as EC2 or Digital Ocean

    by William Stein (noreply@blogger.com) at April 25, 2014 12:08 PM

    April 23, 2014

    Nikhil Peter

    GSoC Selection 2014

    I was recently lucky enough to be selected for GSoC 2014 for SageMath, specifically on improvements to the Sage Android App, both internal and external.

    The current agenda during the Community Bonding Period (up to 18th May) is:

    • Read and understand the current codebase in greater detail.
    • Fix some already existing issues on the git repo which aren’t part of the already proposed fixes to get a better feel for the code.
    • Have a meetup with my mentor, Volker Braun and possibly thrash out some of the finer points of the timeline and the proposed fixes.
    • Decide on some of the libraries to be used, again, during the above meetup.
    • Get this blog up and running

    I also have my end semester exams intermittently during 5th May-17th May, so some balancing will also be necessary.

    Other than that I'm very thrilled, happy and pleasantly surprised that I actually got this opportunity.

    Here’s to a great summer!

    by hav3n at April 23, 2014 03:04 PM

    April 22, 2014

    Harald Schilly

    Sage GSoC 2014 Projects

    This year, Sage is happy to announce that it will be running five Google Summer of Code projects. Welcome everyone to the Sage community and all the best to your summer filled with exciting projects ;-)

    EDIT: lmonade related projects are listed here.

    Nikhil Peter Raj (Volker Braun):

    Improvements to the Sage Android App

    This project aims to improve upon the existing app by improving both internal structure and external user experience, the former by replacing and redoing certain modules of the app (JSON Parsing, HTTP Requests and Exception Handling) and the latter by redesigning the UI so it conforms to the latest standards and specifications of Android Design, as well as adding extra features to the code input such as Syntax Highlighting and options to record and store custom inserts.

    Amit Jamadagni (Miguel Angel Marco-Buzunariz)

    Knot theory implementation

    The project mainly deals with the implementation of various link and knot presentations and the calculation of various related invariants. The main aim of the project is to achieve: conversion between different representations of Knots and Links, to mention a few: Gauss code to Braid Word, Braid word to DT code, Gauss code to DT code. Implementation of various invariants: Alexander Polynomial, Conway Polynomial, Seifert Matrix, Jones Polynomial.

    Jayant Apte (Stefan van Zwam)

    Efficient class-specific membership checks, extensions and visualization 

    The first goal of this proposal is to implement efficient testing of a matroid for membership of various classes and to perform single element extensions of matroids that exploit the knowledge that a given matroid belongs to a certain class, so as to have lower complexity. We also propose the implementation of visualization techniques and an application of matroid theory in the form of enumeration of network codes for a multisource network coding problem, which is one of the timely problems in information theory.

    Simon Spicer (William Stein, Jonathan Bober)

    Efficient exact analytic rank computation for elliptic curves in Sage

    My project would be to implement functionality in Sage to compute the analytic rank of an elliptic curve exactly modulo standard conjectures, with better scaling than existing analytic rank estimation methods. This would necessitate both writing (significant amounts of) new Python and Cython code, as well as porting existing code into the Sage codebase.

    Daniel Bell (Ivan Andrus)

    iSage - improving the Sage iOS apps

    There are several aspects of the Sage iOS apps that need to be refreshed, improved or implemented. I propose to develop the ability to interact with several Sage services from the app, as well as refresh the user interface for the app.

    by Harald Schilly (noreply@blogger.com) at April 22, 2014 06:10 PM

    Martin Albrecht

    LMonade GSoC 2014 Accepted Projects

    The list of accepted projects of this year’s Google Summer of Code is out. For the list of accepted projects for Sage see here, for the LMonade project see below, for all other accepted projects see Google’s site. I am going to mentor William’s M1RI project together with Clément Pernet. It’s going to be a […]

    by martinralbrecht at April 22, 2014 10:15 AM

    April 15, 2014

    David Horgan (quantumtetrahedron)


    This week I have been continuing my collaborative work on the matrix elements of the Hamiltonian Constraint in LQG, see the posts:

    So I have been reviewing knot theory, especially skein relations, SU(2) recoupling theory and the Temperley-Lieb algebra. I have also been reviewing my old sagemath calculations, which I've written up as a paper and will post as an eprint on the arXiv.

    The calculations I have done this week are based on the paper 'Matrix Elements of Thiemann's Hamiltonian Constraint in Loop Quantum Gravity' by Borissov, De Pietri and Rovelli.

    In this paper the authors present an explicit computation of matrix elements of the Hamiltonian constraint operator in non-perturbative quantum gravity. They look at the Euclidean term of Thiemann's version of the constraint and compute its action on trivalent states, for all its natural orderings. The calculation is performed using graphical techniques from the recoupling theory of colored knots and links. They give the matrix elements of the Hamiltonian constraint operator in the spin network basis in compact algebraic form.

    Thiemann’s Hamiltonian constraint operator.
    The Hamiltonian constraint of general relativity can be written as



    The first term in the right-hand side defines the Hamiltonian constraint for gravity with Euclidean signature – accordingly, it is called the Euclidean Hamiltonian constraint.


    For trivalent gauge-invariant vertices, (r, p, q) denotes the colours associated to the edges I, J, K of a vertex v, and we have:


    where i = 1, 2 for the two orderings studied. The explicit values of the matrix elements a1 and a2 are:







    Using Sage Mathematics Software I coded these trivalent vertex matrix elements:

    And then checked the resulting calculations against the values tabulated in the paper.


    Here we can see the A1 and A2 values corresponding to the values of (p, q, r):




    My Sage Mathematics Software code produces precisely the numerical values of A1 and A2 indicated in the figure above.

    For example, for (p,q,r) = (1,0,1) I have A1 = 0.000 and A2 = 0.07433, which matches the tabulated value for 101 to 4-digit accuracy.

    From the point of view of my own research this is great, and I can be confident in the robustness of my coding techniques. I will continue working through papers on the matrix elements of the LQG Hamiltonian constraints, coding up the analyses, working towards coding the full Hamiltonian constraint and the curvature reviewed in the posts:


    by David Horgan at April 15, 2014 07:58 PM

    William Stein

    SageMathCloud's new storage architecture

    Keywords: ZFS, bup, rsync, Sage

    SageMathCloud (SMC) is a browser-based hosted cloud computing environment for easily collaborating on Python programs, IPython notebooks, Sage worksheets and LaTeX documents. I spent the last four months wishing very much that fewer people would use SMC. Today that has changed, and this post explains some of the reasons why.

    Consistency Versus Availability

    Consistency and availability are competing requirements. It is trivial to keep the files in a SageMathCloud project consistent if we store them in exactly one place; however, when the machine the project is on goes down for any reason, the project stops working, and the users of the project are very unhappy. By making many copies of the files in a project, it's fairly easy to ensure that the project is always available, even if network switches in multiple data centers completely fail, etc. Unfortunately, if there are too many users and the synchronization itself puts too heavy a load on the overall system, then machines will fail more frequently, and though projects are available, files do not stay consistent and data is lost to the user (though still "out there" somewhere for me to find).

    Horizontal scalability of file storage and availability of files are also competing requirements. If there are a few compute machines in one place, then they can all mount user files from one central file server. Unfortunately, this approach leads to horrible performance if the network is slow or has high latency; it also doesn't scale up to potentially millions of users. A benchmark I care about is downloading a Sage binary (630MB) and extracting it (creating over 70,000 files); I want this to take at most 3 minutes total, which is hard using a networked filesystem served over the general Internet between data centers. Instead, in SMC, we store the files for user projects on the compute machines themselves, which provides optimal speed. Moreover, we use a compressed filesystem, so in many cases read and write speeds are nearly twice as fast as they might be otherwise.

    New Architecture of SageMathCloud

    An SMC project with id project_id consists of two directories of files, replicated across several machines using rsync:
    1. The HOME directory: /projects/project_id
    2. A bup repository: /bup/bups/project_id
    Users can also create files they don't care too much about in /scratch, which is a compressed and deduplicated ZFS filesystem. It is not backed up in any way, and is local to that compute machine.

    The /projects directory is one single big ZFS filesystem, which is both lz4 compressed and deduplicated. ZFS compression is just plain awesome. ZFS deduplication is much more subtle, as deduplication is tricky to do right. Since data can be deleted at any time, one can't just use a bloom filter to very efficiently tell whether data is already known to the filesystem, and instead ZFS uses a much less memory efficient data structure. Nonetheless, deduplication works well in our situation, since the compute machines all have sufficient RAM (around 30-60GB), and the total data stored in /projects is well under 1TB. In fact, right now most compute machines have about 100GB stored in /projects.
    The /bup/bups directory is also one single big ZFS filesystem; however, it is neither compressed nor deduplicated. It contains bup repositories, where bup is an awesome git-based backup tool written in Python that is designed for storing snapshots of potentially large collections of arbitrary files in a compressed and highly deduplicated way. Since the git pack format is already compressed and deduplicated, and bup itself is highly efficient at deduplication, we would gain almost nothing by using compression or deduplication directly on this ZFS filesystem. When bup deduplicates data, it does so using a sliding window through the file, unlike ZFS which simply breaks the file up into blocks, so bup does a much better job at deduplication. Right now, most compute machines have about 50GB stored in /bup/bups.

    When somebody actively uses a project, the "important" working files are snapshotted about once every two minutes. These snapshots are done using bup and stored in /bup/bups/project_id, as mentioned above. After a snapshot is successfully created, the files in the working directory and in the bup repository are copied via rsync to each replica node. The users of the project do not have direct access to /bup/bups/project_id, since it is of vital importance that these snapshots cannot be corrupted or deleted, e.g., if you are sharing a project with a fat fingered colleague, you want peace of mind that even if they mess up all your files, you can easily get them back. However, all snapshots are mounted at /projects/project_id/.snapshots and browseable by the user; this uses bup's FUSE filesystem support, enhanced with some patches I wrote to support file permissions, sizes, change times, etc. Incidentally, the bup snapshots have no impact on the user's disk quota.

    We also back up all of the bup archives (and the database nodes) to a single large bup archive, which we regularly back up offsite on encrypted USB drives. Right now, with nearly 50,000 projects, the total size of this large bup archive is under 250GB (!), and we can use it to efficiently recover any particular version of any file in any project. The size is relatively small due to the excellent deduplication and compression that bup provides.

    In addition to the bup snapshots, we also create periodic snapshots of the two ZFS filesystems mentioned above... just in case. Old snapshots are regularly deleted. These are accessible to users if they search around enough with the command line, but are not consistent between different hosts of the project, hence using them is not encouraged. This ensures that even if the whole replication/bup system were to somehow mess up a project, I can still recover everything exactly as it was before the problem happened; so far there haven't been any reports of problems.


    Right now there are about 6000 unique weekly users of SageMathCloud and often about 300-400 simultaneous users, and there are nearly 50,000 distinct projects. Our machines are at about 20% disk space capacity, and most of them can easily be expanded by a factor of 10 (from 1TB to 12TB). Similarly, disk space for our Google compute engine nodes is $0.04 per GB per month. So space-wise we could scale up by a factor of 100 without too much trouble. The CPU load is at about 10% as I write this, during a busy afternoon with 363 clients connected very actively modifying 89 projects. The architecture that we have built could scale up to a million users, if only they would come our way...

    by William Stein (noreply@blogger.com) at April 15, 2014 04:02 PM

    April 11, 2014

    Sébastien Labbé

    My status report at Sage Days 57 (RecursivelyEnumeratedSet)

    At Sage Days 57, I worked on the trac ticket #6637: standardize the interface to TransitiveIdeal and friends. My patch proposes to replace TransitiveIdeal and SearchForest by a new class called RecursivelyEnumeratedSet that would handle every case.

    A set S is called recursively enumerable if there is an algorithm that enumerates the members of S. We consider here the recursively enumerated set that are described by some seeds and a successor function succ. The successor function may have some structure (symmetric, graded, forest) or not. Many kinds of iterators are provided: depth first search, breadth first search or elements of given depth.
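    To make the seeds-plus-successor idea concrete, here is an illustrative pure-Python sketch (not the Sage implementation); the successor function below is a hand-rolled stand-in for permutohedron_succ acting on tuples:

    ```python
    from collections import deque

    def breadth_first(seeds, succ):
        """Enumerate all elements reachable from the seeds, closest first."""
        seen = set(seeds)
        queue = deque(seeds)
        while queue:
            x = queue.popleft()  # use queue.pop() instead for depth first search
            yield x
            for y in succ(x):
                if y not in seen:
                    seen.add(y)
                    queue.append(y)

    def succ(p):
        """Swap each ascent of the permutation p (a stand-in for
        permutohedron_succ, acting on tuples)."""
        return [p[:i] + (p[i + 1], p[i]) + p[i + 2:]
                for i in range(len(p) - 1) if p[i] < p[i + 1]]

    print(list(breadth_first([(1, 2, 3)], succ)))
    # → [(1, 2, 3), (2, 1, 3), (1, 3, 2), (2, 3, 1), (3, 1, 2), (3, 2, 1)]
    ```

    Termination relies on the enumerated set being finite; the seen set plays the role of the known set in the Sage code quoted later, which is exactly what a graded or symmetric structure lets one shrink.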

    TransitiveIdeal and TransitiveIdealGraded

    Consider the permutations of \(\{1,2,3\}\) and the poset generated by the method permutohedron_succ:

    sage: P = Permutations(3)
    sage: d = {p:p.permutohedron_succ() for p in P}
    sage: S = Poset(d)
    sage: S.plot()

    The TransitiveIdeal class allows one to generate all permutations from the identity permutation, using the method permutohedron_succ as successor function:

    sage: succ = attrcall("permutohedron_succ")
    sage: seed = [Permutation([1,2,3])]
    sage: T = TransitiveIdeal(succ, seed)
    sage: list(T)
    [[1, 2, 3], [2, 1, 3], [1, 3, 2], [2, 3, 1], [3, 2, 1], [3, 1, 2]]

    Remark that the previous ordering is neither breadth first nor depth first. It is a naive search, because it stores the elements to process in a set instead of a queue or a stack.

    Note that the method permutohedron_succ produces a graded poset. Therefore, one may use the TransitiveIdealGraded class instead:

    sage: T = TransitiveIdealGraded(succ, seed)
    sage: list(T)
    [[1, 2, 3], [2, 1, 3], [1, 3, 2], [2, 3, 1], [3, 1, 2], [3, 2, 1]]

    For TransitiveIdealGraded, the enumeration is breadth first search. However, if you look at the code (version Sage 6.1.1 or earlier), we see that this iterator does not make use of the graded hypothesis at all, because the known set remembers every generated element:

    current_level = self._generators
    known = set(current_level)
    depth = 0
    while len(current_level) > 0 and depth <= self._max_depth:
        next_level = set()
        for x in current_level:
            yield x
            for y in self._succ(x):
                if y == None or y in known:
                    continue
                next_level.add(y)
                known.add(y)
        current_level = next_level
        depth += 1

    Timings for TransitiveIdeal

    sage: succ = attrcall("permutohedron_succ")
    sage: seed = [Permutation([1..5])]
    sage: T = TransitiveIdeal(succ, seed)
    sage: %time L = list(T)
    CPU times: user 26.6 ms, sys: 1.57 ms, total: 28.2 ms
    Wall time: 28.5 ms
    sage: seed = [Permutation([1..8])]
    sage: T = TransitiveIdeal(succ, seed)
    sage: %time L = list(T)
    CPU times: user 14.4 s, sys: 141 ms, total: 14.5 s
    Wall time: 14.8 s

    Timings for TransitiveIdealGraded

    sage: seed = [Permutation([1..5])]
    sage: T = TransitiveIdealGraded(succ, seed)
    sage: %time L = list(T)
    CPU times: user 25.3 ms, sys: 1.04 ms, total: 26.4 ms
    Wall time: 27.4 ms
    sage: seed = [Permutation([1..8])]
    sage: T = TransitiveIdealGraded(succ, seed)
    sage: %time L = list(T)
    CPU times: user 14.5 s, sys: 85.8 ms, total: 14.5 s
    Wall time: 14.7 s

    In conclusion, use TransitiveIdeal for a naive search algorithm and use TransitiveIdealGraded for a breadth first search algorithm. Neither class uses the graded hypothesis.

    Recursively enumerated set with a graded structure

    The new class RecursivelyEnumeratedSet provides all iterators for each case. The examples below are for the graded case.

    Depth first search iterator:

    sage: succ = attrcall("permutohedron_succ")
    sage: seed = [Permutation([1..5])]
    sage: R = RecursivelyEnumeratedSet(seed, succ, structure='graded')
    sage: it_depth = R.depth_first_search_iterator()
    sage: [next(it_depth) for _ in range(5)]
    [[1, 2, 3, 4, 5],
     [1, 2, 3, 5, 4],
     [1, 2, 5, 3, 4],
     [1, 2, 5, 4, 3],
     [1, 5, 2, 4, 3]]

    Breadth first search iterator:

    sage: it_breadth = R.breadth_first_search_iterator()
    sage: [next(it_breadth) for _ in range(5)]
    [[1, 2, 3, 4, 5],
     [1, 3, 2, 4, 5],
     [1, 2, 4, 3, 5],
     [2, 1, 3, 4, 5],
     [1, 2, 3, 5, 4]]

    Elements of given depth iterator:

    sage: list(R.elements_of_depth_iterator(9))
    [[5, 4, 2, 3, 1], [4, 5, 3, 2, 1], [5, 3, 4, 2, 1], [5, 4, 3, 1, 2]]
    sage: list(R.elements_of_depth_iterator(10))
    [[5, 4, 3, 2, 1]]

    Levels (a level is a set of elements of the same depth):

    sage: R.level(0)
    [[1, 2, 3, 4, 5]]
    sage: R.level(1)
    {[1, 2, 3, 5, 4], [1, 2, 4, 3, 5], [1, 3, 2, 4, 5], [2, 1, 3, 4, 5]}
    sage: R.level(2)
    {[1, 2, 4, 5, 3],
     [1, 2, 5, 3, 4],
     [1, 3, 2, 5, 4],
     [1, 3, 4, 2, 5],
     [1, 4, 2, 3, 5],
     [2, 1, 3, 5, 4],
     [2, 1, 4, 3, 5],
     [2, 3, 1, 4, 5],
     [3, 1, 2, 4, 5]}
    sage: R.level(3)
    {[1, 2, 5, 4, 3],
     [1, 3, 4, 5, 2],
     [1, 3, 5, 2, 4],
     [1, 4, 2, 5, 3],
     [1, 4, 3, 2, 5],
     [1, 5, 2, 3, 4],
     [2, 1, 4, 5, 3],
     [2, 1, 5, 3, 4],
     [2, 3, 1, 5, 4],
     [2, 3, 4, 1, 5],
     [2, 4, 1, 3, 5],
     [3, 1, 2, 5, 4],
     [3, 1, 4, 2, 5],
     [3, 2, 1, 4, 5],
     [4, 1, 2, 3, 5]}
    sage: R.level(9)
    {[4, 5, 3, 2, 1], [5, 3, 4, 2, 1], [5, 4, 2, 3, 1], [5, 4, 3, 1, 2]}
    sage: R.level(10)
    {[5, 4, 3, 2, 1]}

    Recursively enumerated set with a symmetric structure

    We construct a recursively enumerated set with symmetric structure, with depth first search as the default enumeration algorithm:

    sage: succ = lambda a: [(a[0]-1,a[1]), (a[0],a[1]-1), (a[0]+1,a[1]), (a[0],a[1]+1)]
    sage: seeds = [(0,0)]
    sage: C = RecursivelyEnumeratedSet(seeds, succ, structure='symmetric', algorithm='depth')
    sage: C
    A recursively enumerated set with a symmetric structure (depth first search)

    In this case, depth first search is the default algorithm for iteration:

    sage: it_depth = iter(C)
    sage: [next(it_depth) for _ in range(10)]
    [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9)]

    Breadth first search. This algorithm makes use of the symmetric structure and remembers only the last two levels:

    sage: it_breadth = C.breadth_first_search_iterator()
    sage: [next(it_breadth) for _ in range(10)]
    [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0), (-1, 1), (-2, 0), (0, 2), (2, 0), (-1, -1)]
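    The "last two levels" optimisation is easy to spell out in plain Python. This is an illustrative sketch, not the Sage implementation: because the successor function is symmetric, an already-seen element must lie in the previous or current level, so nothing older needs to be stored:

    ```python
    def levels_symmetric(seeds, succ):
        """Yield the levels one at a time, remembering only two of them."""
        prev, cur = set(), set(seeds)
        while cur:
            yield cur
            nxt = set()
            for x in cur:
                for y in succ(x):
                    # succ is symmetric, so if y was seen before it is in
                    # the previous or current level: no full 'known' set.
                    if y not in prev and y not in cur:
                        nxt.add(y)
            prev, cur = cur, nxt

    # Same grid successor function as above.
    succ = lambda a: [(a[0] - 1, a[1]), (a[0], a[1] - 1),
                      (a[0] + 1, a[1]), (a[0], a[1] + 1)]
    levels = levels_symmetric([(0, 0)], succ)
    print([len(next(levels)) for _ in range(4)])  # → [1, 4, 8, 12]
    ```

    The level sizes 1, 4, 8 agree with the C.level(0), C.level(1), C.level(2) outputs shown below.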

    Levels (elements of given depth):

    sage: sorted(C.level(0))
    [(0, 0)]
    sage: sorted(C.level(1))
    [(-1, 0), (0, -1), (0, 1), (1, 0)]
    sage: sorted(C.level(2))
    [(-2, 0), (-1, -1), (-1, 1), (0, -2), (0, 2), (1, -1), (1, 1), (2, 0)]

    Timings for RecursivelyEnumeratedSet

    We get the same timings as for TransitiveIdeal, but it uses less memory, so it might be able to enumerate bigger sets:

    sage: succ = attrcall("permutohedron_succ")
    sage: seed = [Permutation([1..5])]
    sage: R = RecursivelyEnumeratedSet(seed, succ, structure='graded')
    sage: %time L = list(R)
    CPU times: user 24.7 ms, sys: 1.33 ms, total: 26.1 ms
    Wall time: 26.4 ms
    sage: seed = [Permutation([1..8])]
    sage: R = RecursivelyEnumeratedSet(seed, succ, structure='graded')
    sage: %time L = list(R)
    CPU times: user 14.5 s, sys: 70.2 ms, total: 14.5 s
    Wall time: 14.6 s

    by Sébastien Labbé at April 11, 2014 05:15 PM

    April 07, 2014

    The Matroid Union

    Matroid Computation with Sage: comparing matroids

    I’m back with more on matroid computation in Sage. Previous installments are here, here, and here. As in the previous post, clicking the “Evaluate” buttons below will execute the code and return the answer. The various computations are again linked, so execute them in the right order.

    Today we will look at various ways to ask (and answer) the question “Are these two matroids equal?” There are some subtle aspects to this, since we had to balance efficiency of the code and mathematical interpretation. The result is that we have various ways to compare matroids.


    Matroids $M$ and $N$ are isomorphic if there is a bijection $\phi: E(M) \to E(N)$ mapping bases to bases and nonbases to nonbases. In Sage, we check this as follows:

    In other words, every matroid in Sage has a method .is_isomorphic() taking another matroid as input and returning True or False. Currently there is no way to extract the actual isomorphism (it is on our wish list), but if you guessed one, you can check its correctness:

    We defined $\phi$ using a dictionary: for instance, phi['c'] equals 2. If you defined your map differently (e.g. as a function or a permutation), Sage will probably understand that too.
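    If you want to see what verifying a guessed isomorphism amounts to, the underlying check is small enough to sketch in plain Python. The toy matroids below (two copies of $U_{1,3}$, given by their bases) and the helper name are made up for illustration:

    ```python
    def is_isomorphism(bases_M, bases_N, phi):
        """Check that the bijection phi maps the bases of M exactly onto
        the bases of N (and hence nonbases onto nonbases)."""
        image = {frozenset(phi[e] for e in B) for B in bases_M}
        return image == set(bases_N)

    # Hypothetical example: U_{1,3} on {'a','b','c'} versus U_{1,3} on {1,2,3}.
    bases_M = {frozenset(['a']), frozenset(['b']), frozenset(['c'])}
    bases_N = {frozenset([1]), frozenset([2]), frozenset([3])}
    phi = {'a': 1, 'b': 2, 'c': 3}
    print(is_isomorphism(bases_M, bases_N, phi))  # → True
    ```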


    Matroids $M$ and $N$ are equal if $E(M) = E(N)$ and the identity map is an isomorphism. We can check for equality as follows:


    The standard way to compare two objects in Sage is through the comparison operator ==. For instance,

    When writing the matroid code, we chose to interpret the question M == N to mean “Is the internal representation of the matroid $M$ equal to the internal representation of $N$?” This is a very restricted view: the only time M == N will return True is if

    • $M$ and $N$ have the same internal data structure (i.e. both are of type BasisMatroid or both are of type LinearMatroid or …)
    • $M$ and $N$ are equal as matroids
    • The internal representations are equivalent (this is at the moment only relevant for linear matroids).
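    These semantics are what you get from any Python class whose == compares internal representations. A toy pure-Python sketch (the class below is a made-up stand-in, not Sage's BasisMatroid):

    ```python
    class ToyBasisMatroid:
        """A matroid stored as its ground set plus the set of its bases."""
        def __init__(self, groundset, bases):
            self.groundset = frozenset(groundset)
            self.bases = frozenset(frozenset(B) for B in bases)

        def __eq__(self, other):
            # Equal only if same internal type AND same internal data;
            # the same matroid stored differently compares as not-equal.
            if type(other) is not type(self):
                return NotImplemented
            return (self.groundset, self.bases) == (other.groundset, other.bases)

        def __hash__(self):
            return hash((self.groundset, self.bases))

    M = ToyBasisMatroid('abc', [['a'], ['b'], ['c']])
    N = ToyBasisMatroid('abc', [['a'], ['b'], ['c']])
    print(M == N, len({M, N}))  # → True 1
    ```

    Defining __hash__ consistently with __eq__ is what makes Python sets collapse the two copies into one, exactly as described for $M_1, M_2, M_4$ below.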

    Let’s consider a few examples:

    So only matroids $M_1$, $M_2$, and $M_4$ pass the equality test. This is because they are all linear matroids over the field $\mathrm{GF}(2)$, have the same ground set, and the matrices are projectively equivalent: one can be turned into the other using only row operations and column scaling. Matroid $M_3$ is isomorphic to $M_1$ but not equal; matroid $M_5$ is represented over a different field; matroid $M_6$ is represented by a list of bases, and matroid $M_7$ is represented by a list of "circuit closures".

    This notion of equality has consequences for programming. In Python, a set is a data structure containing at most one copy of each element.

    So $S$ actually contains $M_3, M_5, M_6, M_7$, and one copy of $M_1, M_2, M_4$.

    This means, unfortunately, that you cannot rely on Python's default filtering tools for sets if you want only matroidally equal objects, or only isomorphic objects. But those equality tests are much more expensive computationally, and in many applications the matroids in a set will be of the same type anyway.

    Inequivalent representations

    As mentioned above, each instance of the LinearMatroid class is considered a represented matroid. There are specialized methods for testing projective equivalence and projective isomorphism. Recall that matroids are projectively equivalent (in Sage’s terminology, field-equivalent) if the representation matrices are equal up to row operations and column scaling. So below, matroids $M_1$ and $M_3$ are field-equivalent; matroids $M_1$ and $M_4$ are field-isomorphic; and matroid $M_2$ has an inequivalent representation:
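    (The interactive cells are not reproduced here.) Field-equivalence can be illustrated in plain Python over $\mathrm{GF}(2)$, where column scaling is trivial, so two full-rank representations of the same shape are field-equivalent exactly when their rows span the same row space. The matrices below are made up for illustration:

    ```python
    from itertools import product

    def row_space_gf2(rows):
        """All GF(2) linear combinations of the given rows, as tuples."""
        n = len(rows[0])
        return {tuple(sum(c * r[j] for c, r in zip(coeffs, rows)) % 2
                      for j in range(n))
                for coeffs in product([0, 1], repeat=len(rows))}

    A = [(1, 0, 0, 1), (0, 1, 0, 1)]
    B = [(1, 1, 0, 0), (0, 1, 0, 1)]  # first row of A plus its second row
    print(row_space_gf2(A) == row_space_gf2(B))  # → True
    ```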

    I probably caused a lot of confusion, so feel free to ask questions in the comments below. Also, the documentation for the functions discussed above gives more explanations and examples.

    by Stefan van Zwam at April 07, 2014 09:32 PM

    April 04, 2014

    David Horgan (quantumtetrahedron)


    This post is just a bit of informal exploration of the LQG curvature operator which I’ve been doing in preparation for a more formal attack during the next week.


    which produces results like those below:



    Informally exploring some ideas in sagemath using work already done on length, angle and volume operators.

    See posts





    This informal work produces reasonable results in terms of magnitudes and form and I’ll refine this over time.


    by David Horgan at April 04, 2014 09:25 PM

    Vince Knight

    A list of stuff for my student to look at before getting down to some Sage development

    +James Campbell will be working with me this Summer on a project aimed at developing game-theoretical functionality in +Sage Mathematical Software System. James just emailed me asking for a list of stuff he could/should read up on before he starts. I thought more knowledgeable people than me might be able to contribute, so I've lazily copied my email to him here: 

    ------------ email ------------


    - git
    - sage development (tickets, documentation etc... This is something I don't know much about myself so read about it on the Sage site and watch videos on youtube, there are a bunch of them)
    - cython (http://cython.org/ - watch this intro to Sage lecture by +William Stein: https://www.youtube.com/watch?v=fI4NlMfGHC0, that's the first lecture in a class he's currently giving; you could also watch the rest)
    - C (to help with cython - you don't necessarily need to be an expert I think)
    - Test driven development (watch all this and you will know what I mean: https://www.youtube.com/playlist?list=PL5859017B018F03F4)
    - ssh and *nix (so that you're comfortable to jump on to one of my machines if necessary - depending on time we might also get you to tweak the Cardiff server)
    - matplotlib (the python library that Sage uses to plot stuff, good to know it from a python pov so as to be able to get Sage to make it do what we want - we might or might not use this)
    - How Sage plots graphs (graph theory graphs like I used for this: http://goo.gl/KHGYk7 - we might or might not need this)

    Game Theory:

    We'll talk about this, but one of the above (easy to code: 30 minutes of work) will be a gentle appetiser to the pièce de résistance: normal form games,

    - Normal form games (first 6 chapters of http://www.vincent-knight.com/teaching/gametheory/)
    - The lrs algorithm (there is an implementation of this written in C that we either want to re-write to get working in Sage, so you'll need to understand it, or get Sage to talk to it / use it; I know Sage kind of has this as an optional library but I'm not entirely sure how to 'get at it': http://cgm.cs.mcgill.ca/~avis/C/lrs.html)
    - Polytopes, you want to be comfortable-ish with the vocabulary around polytopes to be able to understand the lrs algorithm a bit. 


    - In general I'd say don't spend much money on Python books. Like most OSS stuff there's an awesome amount of stuff online. Take a look at: http://pythonbooks.revolunet.com/ (a list of free Python books). There are various exceptions to this rule though.

    - With regards to Sage I don't think you need a book for this project (as it's about building stuff for Sage so mainly reading about the development process and watching youtube videos is the way to go), I have a kindle copy of http://goo.gl/q9s9da, it's not bad but really everything is online. If you haven't already take a look at http://sagemath.org/help.html and join the relevant discussion groups on there.

    - With regards to Game Theory there's a reading list on my website (all that follow are linked to on there). Webb's book is a gentle introduction, Algorithmic Game Theory is great and more about what you will be looking at. Finally there's a newish book by Maschler which looks really nice but I've not had time to work through it yet. In general my course site should suffice (reading/working through those books could take you years) with regards to what you need to know for game theory and I am certainly not asking you to spend money on a book. If there's a book (GT or code) that you really think would be useful let's talk.


    James is a pretty good coder with a good knowledge of a git based workflow already as his first year project (during which he actually learnt to code) has led his group to develop: http://thefightclub.herokuapp.com/ which is also on github (if you've made your way this far, please click on that and waste a minute or 2 of your life).

    If there's anything missing from this list please add it to the comments :)

    I'm looking forward to this project.

    by Vincent Knight (noreply@blogger.com) at April 04, 2014 11:57 AM

    March 27, 2014

    Vince Knight

    Scheduling group presentations using Graph Theory and Sage.

    Yesterday (2014-03-26) I spoke at the Embedded Enterprise Exchange about my new first year module. I used a Hangout on air during the talk and you can see it here:

    That's not what I'm going to talk about here though. The course I talked about ends with all the 'companies' (groups of 4 students) giving a $\approx 25$ minute talk.

    I need to schedule 40ish talks and this needs to fit around the student availability as well as my own. In this post I'll describe how I did this using a combination of Doodle, +Sage Mathematical Software System and Graph Theory.

    The beginning of this post will be some of the boring details but towards the end I start talking about the mathematics (so feel free to skip to there...).

    First of all I needed to know my students availability.

    For this I simply used Doodle: https://doodle.com. I kind of use Doodle for every meeting I have to schedule (they also offer a cool tool that lets you show your availability so students/colleagues have an indication of when I might be able to meet with them).

    Here's a screenshot of the responses:

    You can't really see anything there as I had to zoom out a lot to grab the whole picture. Doodle allows you to download the information for any given poll in .xls format, so I could relatively easily obtain the biadjacency matrix $M$ for my problem, where $M_{ij}$ is 1 if group $i$ is available for schedule slot $j$ and 0 otherwise.

    The mathematics and code needed.

    Once I've got a .csv file (by tweaking the .xls file) of the biadjacency matrix I import that into +Sage Mathematical Software System and convert it to an instance of the `Matrix` class using the following:

    import csv
    with open('responses.csv') as f:  # the tweaked Doodle export (filename illustrative)
        data = [[int(j) for j in row] for row in csv.reader(f)]
    M = Matrix(data)

    I then need to remove any particular scheduled slots that are not picked by any company:

    M = matrix([row for row in M.transpose() if max(row) != 0]).transpose()

    Once I've done this I can define the bipartite graph (bipartite simply means that the vertices can be separated into 2 non-adjacent collections):

    g = BipartiteGraph(M)

    We can then get a picture of this; I do so using a 'partition' (a graph colouring) that will colour the groups (red) and the schedule slots (blue):

    g = BipartiteGraph(M)
    p = g.coloring()
    g.show(layout='circular',dist=1,vertex_size=250, graph_border=True, figsize=[15,15],partition=p)

    The various options I pass to the `show` command are simply to get the circular arrangement (and other minor things):

    The above looks quite messy, and what I essentially want is to get as many pairwise matchings between the blue vertices (slots) and the red vertices (companies) as possible, so that each schedule slot is attributed at most 1 company and every company has at least 1 schedule slot.

    On any given graph $G=(V,E)$ this problem is known as looking for a maximum matching and can be written down mathematically:

    Max:  $\sum_{e \in E(G)} m_e$
    Such that:  $\forall v$ $\sum_{e \in E(G) \atop v \sim e} m_e \leq 1$

    We are in essence finding a subset of edges of our original graph in such a way as to maximise the number of edges such that no vertex has more than 1 edge.
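    For comparison, the same maximum matching can also be computed without an ILP solver, via augmenting paths (Kuhn's algorithm). This is an illustrative pure-Python sketch; the company and slot names are invented:

    ```python
    def max_matching(adj):
        """Maximum bipartite matching; adj maps each company to the list
        of slots it can attend.  Returns a dict slot -> company."""
        match = {}

        def augment(u, seen):
            # Try to give company u a slot, re-seating others if needed.
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    if v not in match or augment(match[v], seen):
                        match[v] = u
                        return True
            return False

        for u in adj:
            augment(u, set())
        return match

    adj = {'Abacus': ['Mon', 'Tue'], 'Batduck': ['Mon'], 'codeX': ['Tue', 'Fri']}
    print(max_matching(adj))  # every company ends up with a distinct slot
    ```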

    This is all explained extremely well at the +Sage Mathematical Software System documentation pages here.

    Furthermore at the documentation the code needed to solve the problem is also given:

    p = MixedIntegerLinearProgram()
    matching = p.new_variable(binary=True)
    p.set_objective(sum(matching[e] for e in g.edges(labels=False)))
    for v in g:
        p.add_constraint(sum(matching[e]
                             for e in g.edges_incident(v, labels=False)) <= 1)
    p.solve()

    When I run the above, `p` is now a solved Mixed Integer Linear Program (corresponding to the matching problem described). To obtain the solution:

    matching = p.get_values(matching)
    schedule = [e for e,b in matching.iteritems() if b == 1]

    Calling `schedule` gives a set of edges (denoted by the corresponding vertex numbers):

    [(5, 57), (0, 45), (23, 50), (4, 42), (38, 60), (26, 56), (34, 62), (16,
    68), (1, 43), (7, 40), (9, 44), (36, 58), (12, 49), (35, 71), (28, 66),
    (25, 47), (24, 53), (6, 46), (3, 64), (39, 67), (17, 69), (22, 55), (13,
    48), (33, 41), (10, 63), (21, 61), (30, 52), (29, 65), (37, 70), (15,
    54), (19, 51), (11, 59)]

    It is then really easy to draw another graph:

    B = Graph(schedule)
    p = B.coloring()
    B.show(layout='circular',dist=1,vertex_size=250, graph_border=True, figsize=[15,15],partition=p)

    Which gives:

    You can see that the obtained graph has all the required properties and most importantly is a lot less congested.

    Some details.

    I'm leaving out some details, for example I kept track of the names of the companies and also the slots so that the final output of all this looked like this:

    4InARow: Fri1100
    Abacus: Tue0930
    Alpha1: Thu1130
    Alpha2: Mon0930
    AusfallLtd: Fri1230
    AxiomEnterprise: Thu1000
    Batduck: Thu1430
    CharliesAngles: Thu1500
    CwmniRhifau: Mon1330
    EasyasPi: Fri1130
    Evolve: Thu1300
    HSPLLtd.: Mon1030
    JECT: Tue1200
    JJSL: Thu1030
    JNTL: Mon1400
    JennyCash: Tue1630
    KADE: Fri1330
    MIAS: Fri1300
    MIPE: Thu1100
    MLC: Tue1600
    Nineties: Mon0900
    Promis: Fri1400
    R.A.C.H: Tue1530
    RYLR: Tue1230
    SBTP: Fri1030
    Serendipity: Mon1230
    UniMath: Tue1300
    VectorEnterprises: Tue1330
    Venus: Thu1400
    codeX: Mon1300
    dydx: Wed1630
    eduMath: Thu0930

    (BatDuck is my favourite company name by far...)

    Why did I do this this way?

    There are 3 reasons:

    1. I shared the schedule with my students through a published Sage sheet on our server. That way they can see a direct applied piece of mathematics and can also understand some of the code if they wanted to.
    2. "Point and click doesn't scale" - I'm sure I could have solved this one instance of my problem with pen and paper and some common sense faster than it took me to write the code to solve the problem. The thing is next year when I need to schedule these talks again it will at most take me 3 minutes as the code is all here and ready to go. (Most readers of this blog won't need that explained but if any of my students find their way here: that's an important message for you).
    3. It was fun.

    by Vincent Knight (noreply@blogger.com) at March 27, 2014 03:35 PM

    March 14, 2014

    Lee Worden

    Sage and WorkingWiki demo: High-level analysis of a population dynamics model

    Publish Date: 
    Fri, 03/14/2014 - 13:39

    State-space diagram produced by the Sage framework

    I've been developing a framework in Sage to work with the kind of dynamic models that my collaborators and I use a lot of the time in population biology and social-science projects. Here's a demo of what it can do:

    by worden at March 14, 2014 08:55 PM

    March 06, 2014

    Sébastien Labbé

    Demo of the IPython notebook at Sage Paris group meeting

    Today I am presenting the IPython notebook at the meeting of the Sage Paris group. This post gathers what I prepared.


    First you can install the ipython notebook in Sage as explained in this previous blog post. If everything works, then you run:

    sage -ipython notebook

    and this will open a browser.

    Turn on Sage preparsing

    Create a new notebook and type:

    In [1]: 3 + 3
    In [2]: 2 / 3
    In [3]: matrix
    Traceback (most recent call last):
    NameError: name 'matrix' is not defined

    By default, Sage preparsing is turned off and Sage commands are not known. To turn on Sage preparsing (thanks to a post by Jason on sage-devel):

    %load_ext sage.misc.sage_extension

    Since sage-6.2, according to sage-devel, the command is:

    %load_ext sage

    You now get Sage commands working in ipython:

    In [4]: 3 + 4
    Out[4]: 7
    In [5]: 2 / 3
    Out[5]: 2/3
    In [6]: type(_)
    Out[6]: <type 'sage.rings.rational.Rational'>
    In [7]: matrix(3, range(9))
    [0 1 2]
    [3 4 5]
    [6 7 8]

    Scroll and hide output

    If the output is too big, click on Out to scroll or hide the output:

    In [8]: range(1000)

    Sage 3d Graphics

    3D graphics work but open in a new Jmol window:

    In [9]: sphere()

    Sage 2d Graphics

    Similarly, 2D graphics work but open in a new window:

    In [10]: plot(sin(x), (x,0,10))

    Inline Matplotlib graphics

    To create inline matplotlib graphics, the notebook must be started with this command:

    sage -ipython notebook --pylab=inline

    Then, a matplotlib plot can be drawn inline (example taken from this notebook):

    import matplotlib.pyplot as plt
    import numpy as np
    x = np.linspace(0, 3*np.pi, 500)
    plt.plot(x, np.sin(x**2))
    plt.title('A simple chirp');

    Or with:

    %load http://matplotlib.org/mpl_examples/showcase/integral_demo.py

    According to the previously cited notebook, it seems that the inline mode can also be selected from within the notebook using a magic command, but with my version of IPython (0.13.2), I get an error:

    In [11]: %matplotlib inline
    ERROR: Line magic function `%matplotlib` not found.

    Use latex in a markdown cell

    Change an input cell into a markdown cell and then you may use latex:

    Test $\alpha+\beta+\gamma$

    Output in latex

    The output can be shown with latex and mathjax using the ipython display function:

    from IPython.display import display, Math
    def my_show(obj): return display(Math(latex(obj)))
    y = 1 / (x^2+1)
    my_show(y)

    ipynb format

    Create a new notebook with only one cell. Name it range_10 and save:

    In [1]: range(10)
    Out[1]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

    The file range_10.ipynb is saved in the directory. You can also download it from File > Download as > IPython (.ipynb). Here is the content of the file range_10.ipynb:

     "metadata": {
      "name": "range_10"
     "nbformat": 3,
     "nbformat_minor": 0,
     "worksheets": [
       "cells": [
         "cell_type": "code",
         "collapsed": false,
         "input": [
         "language": "python",
         "metadata": {},
         "outputs": [
           "output_type": "pyout",
           "prompt_number": 1,
           "text": [
            "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
         "prompt_number": 1
         "cell_type": "code",
         "collapsed": false,
         "input": [],
         "language": "python",
         "metadata": {},
         "outputs": []
       "metadata": {}

    ipynb is just json

    An ipynb file is written in JSON format. Below, we use json to open the file `range_10.ipynb` as a Python dictionary.

    sage: s = open('range_10.ipynb','r').read()
    sage: import json
    sage: D = json.loads(s)
    sage: type(D)
    <type 'dict'>
    sage: D.keys()
    [u'nbformat', u'nbformat_minor', u'worksheets', u'metadata']
    sage: D
    {u'metadata': {u'name': u'range_10'},
     u'nbformat': 3,
     u'nbformat_minor': 0,
     u'worksheets': [{u'cells': [{u'cell_type': u'code',
         u'collapsed': False,
         u'input': [u'range(10)'],
         u'language': u'python',
         u'metadata': {},
         u'outputs': [{u'output_type': u'pyout',
           u'prompt_number': 1,
           u'text': [u'[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]']}],
         u'prompt_number': 1},
        {u'cell_type': u'code',
         u'collapsed': False,
         u'input': [],
         u'language': u'python',
         u'metadata': {},
         u'outputs': []}],
       u'metadata': {}}]}

    Load vaucanson.ipynb

    Download the file vaucanson.ipynb from the last meeting of Paris Sage Users. You can view the complete demo, including pictures of automata, even if you are not able to install vaucanson on your machine.

    IPython notebook from a Python file

    In a Python file, separate your code with the following line to create cells:

    # <codecell>

    For example, create the following Python file. Then, import it in the notebook. It will get translated to ipynb format automatically.

    # -*- coding: utf-8 -*-
    # <nbformat>3.0</nbformat>
    # <codecell>
    %load_ext sage.misc.sage_extension
    # <codecell>
    matrix(4, range(16))
    # <codecell>

    More conversion

    Since release 1.0 of IPython, many conversions from ipynb to other formats are possible (html, latex, slides, markdown, rst, python). Unfortunately, the version of IPython in Sage is still 0.13.2 as of today, but version 1.2.1 will be in sage-6.2.

    by Sébastien Labbé at March 06, 2014 10:47 AM

    February 25, 2014

    The Matroid Union

    Google Summer of Code 2014

    Google’s Summer of Code is an annual event in which Google will pay a bunch of students to spend the summer writing code for open-source projects.

    As last year, Sage is one of the mentoring organizations, and as last year, we are hoping to have one of about 4 students work on Sage’s matroid capabilities. We are looking for an undergraduate or graduate student who

    • Knows a bit of matroid theory (some possible topics require more knowledge than others)
    • Has some programming experience, preferably in Python
    • Wants to work on building exciting mathematics software for 3 months during the summer
    • Would like the word “Google” on his or her CV
    • Is interested in earning 5500 US dollars and a t-shirt

    On the ideas page, a list of possible features to work on can be found. Students must produce a detailed proposal including a timeline with milestones. Early interaction with the mentors is important: we want to know you’re serious about this, and able to deliver. The place for this is the sage-gsoc Google group.

    If you know an interested student, send her or him to this post. If you are an interested student, study the links above to figure out the requirements, timeline, etc. One important date: applications must be in on

    MARCH 21

    and we’d like to start working on the proposal with you well before then.

    by Stefan van Zwam at February 25, 2014 07:44 PM

    February 21, 2014

    Vince Knight

    Best responses to mixed strategies in class

    On Monday my Game Theory class and I took a look at the connection between extensive form games and normal form games (leading to subgame perfection), which corresponds to these two chapters of my course: C7 and C8. Before starting that we took another look at best responses to mixed strategies (this chapter of my course).

    We have been using this game quite a bit in class:

    \[\begin{pmatrix}
    (2,-2) & (-2,2)\\
    (-1,1) & (1,-1)
    \end{pmatrix}\]

    We played it before and I blogged about it here. This is a slight modification of the matching pennies game where the first strategy corresponds to playing Heads (\(H\)) and the second to playing Tails (\(T\)).

    If player 1 (the row player) is playing a mixed strategy \(\sigma_1=(x, 1-x)\) then the utility to player 2 when player 2 plays $H$ (the first column) can be written as:

    \[u_2(\sigma_1, H) = -2x + (1-x) = 1 - 3x\]

    and when player 2 plays $T$:

    \[u_2(\sigma_1, T) = 2x - (1-x) = 3x - 1\]
    We can plot these two utilities here (using +Sage Mathematical Software System):

    It is immediate to note that when \(x < 1/3\) player 2 should play $H$. In fact we can write down player 2's best response \(s_2^*\) to any \(\sigma_1\):

    \[s_2^* = \begin{cases}
    H,&x < 1/3\\
    T,&x > 1/3
    \end{cases}\]
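    Since both utilities are linear in \(x\), the best response and the expected scores quoted later in the post can be checked with a few lines of plain Python (a minimal sketch; the function names are mine, not from the course code):

```python
from fractions import Fraction

def u2_H(x):
    # Player 2 plays H against sigma_1 = (x, 1-x): -2x + 1*(1-x)
    return 1 - 3 * x

def u2_T(x):
    # Player 2 plays T against sigma_1 = (x, 1-x): 2x - 1*(1-x)
    return 3 * x - 1

def best_response(x):
    if u2_H(x) > u2_T(x):
        return 'H'
    if u2_T(x) > u2_H(x):
        return 'T'
    return 'indifferent'

assert best_response(0.2) == 'H'                       # round 1
assert best_response(0.9) == 'T'                       # round 2
assert best_response(Fraction(1, 3)) == 'indifferent'  # round 3

# Expected scores of the theoretical best response over 6 plays:
assert abs(6 * u2_H(0.2) - 2.4) < 1e-12
assert abs(6 * u2_T(0.9) - 10.2) < 1e-12
```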

    Using all this I played the following game in class:

    • I handed out sheets asking students to play against 3 separate mixed strategies \(\sigma_1\in\{(.2,.8),(.9,.1),(1/3,2/3)\}\). I will refer to these 3 rounds as R1, R2 and R3;
    • Students (acting as player 2) filled in their strategies;
    • I then used the following interact to sample mixed strategies according to \(\sigma_1\):

    I changed the value of \(x\) as required.
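    The interact code itself is not reproduced here, but what it does amounts to drawing \(H\) with probability \(x\) and \(T\) otherwise. A hypothetical stand-in in plain Python:

```python
import random

def sample_round(x, n=6, rng=random):
    # Sample n plays of the row player's mixed strategy (x, 1-x):
    # 'H' with probability x, 'T' otherwise.
    return ''.join('H' if rng.random() < x else 'T' for _ in range(n))

random.seed(0)              # fix the seed so the sketch is reproducible
round1 = sample_round(0.2)  # one 6-play round against sigma_1 = (.2, .8)
```

    (The class results below came from the actual Sage interact, so the sampled strings of course differ.)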

    Here are the three row strategies that were sampled:

    • R1: TTTTTH 
    • R2: HHHTHH 
    • R3: TTHTTT 

    This is obviously not giving the exact proportions dictated by the mixed strategy \(\sigma_1\) but that's also kind of the point. By round, here are the results.


    Here's a plot of the mixed strategy that was played by the entire class during round 1:

    This corresponds to \(\sigma_2=(.70,.30)\), so most students seemed willing to 'trust the theory' that one should play $H$ against this mixed strategy.

    4 students scored the highest score (\(7\)) and here's the strategy they all played: \(HHHHHT\), in essence they got lucky and maxed out what they could have had. If they had played the theoretical best response (to only play $H$) they would have scored: 3.

    The expected value of playing the theoretical best response (always pick \(H\)) against this mixed strategy is: \(6(1-3\times.2)=2.4\) (recall that \(\sigma_1=(.2,.8)\) for this round).

    The mean score for this round was 1.4 and here's a distribution of the scores:

    The 47 students who 'won', i.e. scored a positive score (this is a zero sum game), played \(\sigma_2=(.83,.17)\). The 18 who 'lost' (scored a negative score) played \((.37,.63)\).

    It's nice to see that there was a large amount of students who did in fact score 3.


    Here's a plot of the mixed strategy that was played by the entire class during round 2:

    This corresponds to \(\sigma_2=(.12,.88)\), which is again pretty close to the theoretical best response.

    2 students scored the highest score: 11. They once again lucked out and played the perfect response: \(TTTHTT\). If they had played the theoretical best response they would have scored 9.

    The expected value of playing the theoretical best response (always pick \(T\)) against this mixed strategy is: \(6(3\times.9-1)=10.2\) (recall that \(\sigma_1=(.9,.1)\) for this round).

    The mean score for this round was  6.9 and here's a distribution of the scores:

    The 60 students who 'won', i.e. scored a positive score (this is a zero sum game), played \(\sigma_2=(.07,.93)\). The 5 who 'lost' (scored a negative score) played \((.77,.23)\).


    The third round had the students playing against a mixed strategy for which they should have been indifferent. Here's how they played:

    This corresponded to \(\sigma_2=(0.62,.38)\).

    There were 10 winners for this game and they scored 10 (quite a few strategy profiles gave this score so I won't list them, but they mainly took advantage of the fact that mostly $T$ was sampled). (The theoretical utility is in fact 0, as you can see from one of the plots above.)

    The mean score for this round was .4 (which is quite close to the theoretical value of 0). Here's the distribution of the scores:

    28 scored positively playing \(\sigma_2=(.64,.36)\) and 37 scored negatively playing \(\sigma_2=(.77,.23)\).

    What's nice to see here is that this 3rd round is a bit more random, with an almost (stretching the definition of the word almost) equal distribution between the number of students who won and lost.

    Here's a distribution of the overall scores:

    The overall winner of the game (who scored the most over the 3 rounds) was Becky who played:

    • R1: \(TTHHHH\)
    • R2: \(TTTTTT\)
    • R3: \(HHTHTH\)

    For a cumulative score of: 21

    This was good fun to analyse and was hopefully useful to my students to see what is meant by best responses to mixed strategies. It was particularly cool to see an 'indifferent' (again stretching the definition of the word indifferent) response to the third round.

    (Like with everything for this course you can find the data, analysis scripts and everything else at this github repo)

    by Vincent Knight (noreply@blogger.com) at February 21, 2014 03:58 AM

    February 16, 2014

    David Horgan (quantumtetrahedron)


    One of the things  I’ve been  working on this week  is : Moduli Space of Shapes of a Tetrahedron with Faces of Equal Area.

    This follows on the work done in the posts:

    The space of shapes of a tetrahedron with fixed face areas is naturally a symplectic manifold of real dimension two. This symplectic manifold turns out to be a Kahler manifold and can be parametrized by a single complex coordinate Z given by the cross ratio of four complex numbers obtained by stereographically projecting the unit face normals onto the complex plane.

    This post shows how this works in the simplest case of a tetrahedron T whose four face areas are equal. For convenience, the cross-ratio coordinate Z is shifted and rescaled to z=(2Z-1)/Sqrt[3] so that the regular tetrahedron corresponds to z=i, in which case the upper half-plane is mapped conformally into the unit disc w=(i-z)/(i+z). The equi-area tetrahedron T is then drawn as a function of the unit disc coordinate w.
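    These changes of variable are easy to verify numerically. A small sketch in plain Python (the symbols follow the text; the helper names are mine): the regular tetrahedron's cross-ratio Z = exp(iπ/3) gives z = i, which the conformal map sends to the centre of the disc, w = 0.

```python
import cmath
import math

def z_from_Z(Z):
    # Shifted and rescaled cross-ratio: z = (2Z - 1)/sqrt(3)
    return (2 * Z - 1) / math.sqrt(3)

def w_from_z(z):
    # Conformal map of the upper half-plane onto the unit disc
    return (1j - z) / (1j + z)

Z = cmath.exp(1j * math.pi / 3)  # cross-ratio of the regular tetrahedron
z = z_from_Z(Z)
w = w_from_z(z)
assert abs(z - 1j) < 1e-12   # the regular tetrahedron sits at z = i
assert abs(w) < 1e-12        # i.e. at the centre of the unit disc
```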




    by David Horgan at February 16, 2014 04:12 PM

    February 12, 2014

    William Stein

    What is SageMathCloud?

    The two main reasons for the existence of SageMathCloud (SMC) are...

    Goal 1. Increase resources for Sage: Generate a different longterm revenue stream to support development of Sage, i.e., open source mathematical software. By "different", I mean different than government and foundation grants and donations, which are relatively limited for primarily pure mathematics software development, which is what Sage specializes in. Even in my wildest dreams, it is very unlikely Sage will get more than a million dollars a year in funding (and in practice it gets a lot less); however, a successful commercial product with wide adoption has the potential to generate significantly more than a million dollars a year in revenue -- of course most would go back into the product... but when the product is partly Sage, that's fine. The National Science Foundation (and other donors) have played a major part during the last 8 years in funding Sage, but I think everybody would benefit from another funding source.

    Goal 2. Increase the usage of Sage: The number of unique visitors per month to http://sagemath.org grew nicely from 2005 (when I started Sage) until Summer 2011, after which point it has remained fairly constant at 70,000 unique visitors. There is no growth at all: it was 70,332 in Jan 2011, and it was 70,449 last month (Jan 2014), both with a bounce rate of about 50%. A significant obstruction to growth is accessibility, which SMC helps to address for certain users (last month the SMC website had 17,700 unique visitors with a bounce rate of about 30%).

    Here's an actual email I received from somebody literally as I was writing this, which I think illustrates how SMC addresses the second goal:

    Hey William,

    Today I stopped by cloud.sagemath.com because
    I wanted to do some computation with sage, and
    cloud is announced in a big way on sagemath.org

    This is after a lengthy hiatus from computing
    with sage ( maybe a year ).

    Using cloud.sagemath.com completely blew my
    mind. At first I did not really understand
    why sagenb was ditched after all the work that
    went into it. But man, cloud is really a
    pleasure to use !

    I just wanted to share the joy :)

    Thanks for all that you do !

    Licensing and Reuse of the SageMathCloud Codebase

    The design and coding of SageMathCloud (SMC) has been mostly supported by the University of Washington (UW). Due to goal 1 above, I have been working from the start (before a line of code was written) with the commercialization/tech transfer office of UW, who (because of 1) are not enthusiastic about simply open sourcing the whole SMC codebase, as a condition for their help with commercialization. Some of SMC is open sourced, mainly the code that runs on the VM's and some of the HTML5 client that runs in the browser. We also plan to make the HTML5 client and a mini server BSD licensed, and include them with Sage (say) as a new local graphical interface. Of course SMC builds on top of many standard open source libraries and tools (e.g., CodeMirror, Cassandra, ZFS, Node.js, etc.).

    There is, however, a large amount of interesting backend code, which is really the "cloud" part of SMC, and which we do not intend to release as open source. We do intend to sell licenses (with support) for the complete package, when it is sufficiently polished, since many organizations want to run their own private SMC servers, mainly for confidentiality reasons.

    Goal 2 above mainly impacts how we market SMC. However, it's easy to completely ignore Sage and still get a lot of value out of SMC. I just glanced at what people are doing as I write this, and the result seems pretty typical: latex'ing documents, some Sage worksheets, some IPython notebooks, editing a perl script.

    It's important to understand how SMC is different than other approaches to cloud computing. It's designed to make certain things very easy, but they are quite different things than what "traditional" cloud stacks like OpenStack are designed to make easy. SMC is supposed to make the following easy:

    • using Sage and IPython, both command line and notebook interfaces.
    • writing a paper using LaTeX (possibly with a specific private list of collaborators),
    • editing source code, e.g., developing Python/C/etc. libraries, again possibly with realtime collaboration.
    • creating collaborative "projects", which are really a Linux account on a machine, and provide isolation from other projects.
    • backups: all data is automatically snapshotted frequently
    • high availability: failure of a machine (or even whole data center) results in at most a few minutes of lost time/work.
    • speed: files are stored on a compressed local filesystem, which is snapshotted and replicated out regularly; thus the filesystem feels fast and is scalable, as compared to a networked filesystem.

    The above design goals are useful for certain target audiences, e.g., people doing Sage/Python/etc. development, teachers and students in courses that make use of Sage/Python/etc., and collaborative math research projects. SMC is designed so that a large number of people can make simultaneous small use of ever-expanding resources. SMC should also fully support the "social networks" that form in this context. At the same time, it's critical that SMC have excellent uptime and availability (and offsite backups, just in case), so that people can trust it. By trust, I don't mean so much in the sense of "trust it with proprietary info", but in the sense of "trust it to not just lose all my data and to be there when I'm giving a talk/teaching a class/need to do homework/etc.".

    However, exactly the above design goals are at odds with some of goals of large-scale scientific/supercomputing. The following are not design goals of SMC:

    • supercomputing -- have large data that many distributed processes operate on: exactly what people often do on supercomputers (or with Hadoop, etc.)
    • traditional "cloud computing" -- dynamically spin up many VM's, run computations on them; then destroy them. With SMC, things tend to get created but not destroyed (e.g., projects and files in them), and a full VM is much too heavy given the number of users and type of usage that we have already (and plan to have).

    What happens in practice with SMC is that people run smaller-scale computations on SMC (say things that just take a few cores), and when they want to run something bigger, they ssh from SMC to other resources they have (e.g., a supercomputer account) and launch computations there. All project collaborators can see what anybody types in a terminal, which can be helpful when working with remote compute clusters.

    Anyway, I hope this helps to clarify what exactly SMC actually is.

    by William Stein (noreply@blogger.com) at February 12, 2014 06:41 AM

    February 04, 2014

    Sébastien Labbé

    Orbit drawings and computations with Sage for a function related to the LLL algorithm

    Today there was a meeting of the ANR DynA3S project. Following a presentation by Brigitte Vallée, I wrote a few lines of Sage code to study a function she has introduced. This function is related to understanding the density of the subdiagonal terms that occur during the execution of the LLL algorithm.

    First of all, here is my file: brigitte.sage.

    To use this file, you first have to load it into Sage with the following command. On the command line this works fine; in the Sage notebook, I am no longer sure whether the load command still allows it (?):

    sage: %runfile brigitte.sage       # not tested

    Several orbits must be generated in order to visualize anything, because the orbits of the function are generally of size 1, 2 or 3 before the stopping condition is reached. Here we generate 10000 orbits (the initial points are chosen uniformly at random in \([0,1]\times[-0.5, 0.5]\)). We draw the last points of each orbit:

    sage: D = plusieurs_orbit(10000)
    Note: la plus longue orbite est de taille 3
    sage: A = points(D[0], color='red', legend_label='derniers')
    sage: B = points(D[1], color='blue', legend_label='avant derniers')
    sage: C = points(D[2], color='black', legend_label='2e avant derniers')
    sage: G = A + B + C
    sage: G.axes_labels(("$x$", r"$\nu$"))
    sage: title = r"$(x,\nu) \mapsto (\frac{x}{(x+\nu^2)^2},\frac{\nu}{(x+\nu^2)})$"
    sage: G.show(title=title, xmax=2)
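    For readers without brigitte.sage at hand, the map named in the title string, \((x,\nu) \mapsto (\frac{x}{(x+\nu^2)^2},\frac{\nu}{(x+\nu^2)})\), is easy to re-implement. A standalone sketch in plain Python (the stopping condition used in the real code is not given in this post, so a fixed cap of three steps is an assumption here, matching the observed orbit sizes):

```python
import random

def f(x, nu):
    # One step of the map (x, nu) -> (x/(x + nu^2)^2, nu/(x + nu^2))
    d = x + nu**2
    return x / d**2, nu / d

def orbit(x, nu, max_steps=3):
    # Iterate the map from (x, nu); the cap of 3 steps is an assumption,
    # mirroring the observation that orbits have size 1, 2 or 3.
    points = [(x, nu)]
    for _ in range(max_steps):
        x, nu = f(x, nu)
        points.append((x, nu))
    return points

# Initial points drawn uniformly in [0,1] x [-0.5, 0.5], as in the post.
random.seed(0)
starts = [(random.uniform(0, 1), random.uniform(-0.5, 0.5))
          for _ in range(10)]
orbits = [orbit(x, nu) for x, nu in starts]
```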

    A shortcut that produces roughly the same drawing as above:

    sage: draw_plusieurs_orbites(10000).show(xmax=2)

    We draw superposed histograms of the density of these points once projected onto the \(\nu\) axis:

    sage: histogrammes_nu(10000, nbox=10)

    The drawing seems to indicate that the non-uniform density comes simply from the points \((x,\nu)\) with \(x\leq 1\).

    We draw superposed histograms of the density of these points once projected onto the \(x\) axis (colors are assigned according to the value of \(\nu\)):

    sage: histogrammes_x(30000, nbox=5, ymax=1500, xmax=8)

    The drawing seems to indicate that the density does not depend on \(\nu\) for \(x\geq 1\).

    by Sébastien Labbé at February 04, 2014 08:00 AM

    February 03, 2014

    David Horgan (quantumtetrahedron)


    This week I have been  reviewing the new spinfoam vertex in 4d models of quantum gravity. This was discussed in the recent posts:

    In this post I explore the large-spin asymptotic properties of the overlap coefficients:

    characterizing the holomorphic intertwiners in the usual real basis. This consists of the normalization coefficient times the shifted Jacobi polynomial.

    In the case of n = 4, I can study the asymptotics of the shifted Jacobi polynomials in the limit ji → λji, λ → ∞. A convenient integral representation for the shifted Jacobi polynomials is given by a contour integral:


    This leads to the result that:

    This formula relates the two very different descriptions of the phase space of shapes of a classical tetrahedron – the real one in terms of the k, φ parameters and the complex one in terms of the cross-ratio
    coordinate Z. As is clear from this formula, the relation between the two descriptions is non-trivial.

    In this post I have only worked with the simplest case of this relation, when all areas are equal. In this 'equi-area' case, where all four representations are equal, ji = j, ∀i = 1, 2, 3, 4, as described in the post Holomorphic Factorization for a Quantum Tetrahedron, the overlap function is:


    Using sagemath I am able to evaluate the overlap coefficients for various values of j and the cross-ratios z.



    Here I plot the modulus of the equi-area case state Ck, for j = 20, as a function of the spin label k, for the value of the cross-ratio Z = exp(iπ/3) that corresponds to the equilateral tetrahedron. It is obvious that the distribution looks Gaussian. We also see that the maximum is reached for kc = 2j/√3 ∼ 23, which agrees with an asymptotic analysis.
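    The quoted peak position is just the stated asymptotic \(k_c = 2j/\sqrt{3}\) evaluated at j = 20; a one-line check:

```python
import math

j = 20
k_c = 2 * j / math.sqrt(3)  # asymptotic peak of |C_k| in the equilateral case
assert round(k_c) == 23     # k_c ~ 23.09, matching the plotted maximum
```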

    Here I plot the modulus of the equi-area case state Ck for various j values as a function of the spin label k, for the value of the cross-ratio Z = exp(iπ/3) that corresponds to the equilateral tetrahedron.


    Here I have plotted the modulus of the j = 20 equi-area state Ck for increasing cross-ratios Z = 0.1i, 0.8i, 1.8i. The Gaussian distribution progressively moves its peak from 0 to 2j. This illustrates how changing the value of Z affects the semi-classical geometry of the tetrahedron.


    In this post I have studied a holomorphic basis for the Hilbert space Hj1,…,jn of SU(2) intertwiners. In particular I have looked at the case of 4-valent intertwiners that can be interpreted as quantum states of a quantum tetrahedron. The formula

    gives the inner product in Hj1,…,jn in terms of a holomorphic integral over the space of ‘shapes’ parametrized by the cross-ratio coordinates Zi. In the tetrahedral n = 4 case there is a single cross-ratio Z. The n=4 holomorphic intertwiners parametrized by a single cross-ratio variable Z are coherent states in that they form an over-complete basis of the Hilbert space of intertwiners and are semi-classical states peaked on the geometry of a classical tetrahedron as shown by my numerical studies. The new holomorphic intertwiners are related to the standard spin basis of intertwiners that are usually used in loop quantum gravity and spin foam models, and the change of basis coefficients are given by Jacobi polynomials.

    In the canonical framework of loop quantum gravity, spin network states of quantum geometry are labeled by a graph as well as by SU(2) representations on the graph’s edges e and intertwiners on its vertices v. It is now possible to put holomorphic intertwiners at the vertices of the graph, which introduces the new spin networks labeled by representations je and cross-ratios Zv. Since each holomorphic intertwiner can be associated to a classical tetrahedron, we can interpret these new spin network states as discrete geometries. In particular, geometrical observables such as the volume can be expected to be peaked on their classical values as shown in my numerical studies for j=20. This should be of great help when looking at the dynamics of the spin network states and when studying how they are coarse-grained and refined.


    by David Horgan at February 03, 2014 12:15 AM

    January 23, 2014

    Vince Knight

    My Game Theory YouTube Playlist and other resources

    I just added Graham Poll's awesome +YouTube playlist (http://goo.gl/UZ1Ws) to my "reading" list for my Game Theory course that I'm teaching on Monday and thought that I should also include the humble videos related to Game Theory that I have on my channel:

    I also thought I could get away with making a blog post about this. The playlist above has them in 'last in first out' order but here they are in the order that I made them:

    1. "An introduction to mixed strategies using Sage math's interact page."

    A video that looks at the 'battle of the sexes' game and also shows of a +Sage Mathematical Software System interact.

    2.  "Selfish Behaviour in Queueing Systems"

    A video close to my research interests, which looks at the intersection of Game Theory and Queueing Theory. This video is actually voiced by +Jason Young, who was doing his first research internship with me at the time and will be starting his PhD at the beginning of the 2014/2015 academic year.

    3. "Pigou's Example"

    A video describing a type of Game called a 'routing game'. Pigou's example is a particular game that shows the damaging effect of selfish (rational) behaviour in a congestion affected system. This video also comes with a bit of +Sage Mathematical Software System code.

    4. "Calculating a Tax Fare using the Shapley Value"

    This is one of my most popular videos despite the error that +Brandon Hurr pointed out at 3:51. It describes a basic aspect of Cooperative Game Theory and uses the familiar example of needing to share a taxi fare as an illustration.

    5. "Using agent based modelling to identify emergent behaviour in game theory"

    This video shows off some Python code that I've put online that allows the creation of a population of players/agents that play any given normal form game. There are some neat animations showing the players choosing different strategies as they go.

    6. "OR in Schools - Game Theory activity"

    This isn't actually a video of mine. It is on +LearnAboutOR 's channel but it's a 1hr video of one of the outreach events I do which gets kids/students using Game Theory.

    7. "Selfish behaviour in a single server queue"

    I built a simulation of a queue (Python code here) with a graphical representation (so you see dots going across the screen). This video simply shows what it can do but also shows how selfish behaviour can have a damaging effect in queues.

    I'm going to be putting together (fingers crossed: time is short) a bunch more over the coming term.

    by Vincent Knight (noreply@blogger.com) at January 23, 2014 09:59 AM

    January 16, 2014

    David Horgan (quantumtetrahedron)


    In this post, a theory of quantum gravity called Causal Dynamical Triangulation (CDT) is explored. The 1+1 dimensional universe, the simplest case of an empty universe of one spatial and one temporal dimension is simulated in sagemath. This post explains CDT in general and presents and explains the results of the 1+1 dimensional simulations.

    Understanding gravity at the fundamental level is key to a deeper understanding of the workings of the universe. The problem of unifying Einstein’s theory of General Relativity with Quantum Field Theory is an unsolved problem at the heart of understanding how gravity works at the fundamental level. Various attempts have been made so far at solving the problem. Such attempts include String Theory, Loop Quantum Gravity, Horava-Lifshitz gravity, Causal Dynamical Triangulation as well as others.

    Causal Dynamical Triangulation is an approach to quantum gravity that recovers classical spacetime at large scales by enforcing causality at small scales. CDT combines quantum physics with general relativity in a Feynman sum over geometries and converts the sum into a discrete statistical physics problem. I solve this problem using a Monte Carlo simulation to compute the spatial fluctuations of an empty universe with one space and one time dimension. The results compare favourably with theory and provide an accessible but detailed introduction to quantum gravity via a simulation that runs on a computer.

    In order to use the CDT approach, the Einstein-Hilbert action of General Relativity and the path integral approach to Quantum Field Theory are both important. I’ll begin by introducing both concepts as well as the metric and the Einstein Field equations. In this post I attempt, at least briefly, to explain CDT in general and explain what I have found with my simulation.

     Quantum gravity

    Theories of quantum gravity attempt to unify quantum theory with general relativity, the theory of classical gravity as spacetime curvature. Superstring theory tries to unify gravity with the electromagnetic and weak and strong nuclear interactions, but it requires supersymmetry and higher dimensions, which are as yet unobserved. It proposes that elementary particles are vibrational modes of strings of the Planck length and classical spacetime is a coherent oscillation of graviton modes.

    Loop quantum gravity does not attempt to unify gravity with the other forces, but it directly merges quantum theory and general relativity to conclude that space is granular at the Planck scale. It proposes that space is a superposition of networks of quantised loops and spacetime is a discrete history of such networks.

    Causal Dynamical Triangulation is a conservative approach to quantum gravity that constructs spacetime from triangular-like building blocks by gluing their time-like edges in the same direction. The microscopic causality inherent in the resulting spacetime foliation ensures macroscopic space and time as we know it. Despite the discrete foundation, CDT does not necessarily imply that spacetime itself is discrete. It merely grows a combinatorial spacetime from the building blocks according to a propagator fashioned from a sum-over-histories superposition. Dynamically generating a classical universe from quantum fluctuations has been accomplished. CDT recovers classical gravity at large scales, but predicts that the number of dimensions drops continuously from 4 to 2 at small scales. Other approaches to quantum gravity have also predicted similar dimensional reductions from 4 to 2 near the Planck scale.

     Classical gravity

    In the theory of relativity, space and time are on equal footing and mix when boosting between reference frames in relative motion. In the flat Minkowski spacetime of special relativity, the invariant proper space dr and time ds between two nearby events follow from the generalised Pythagorean theorem.

    2dcdtequ1

    which involves the difference in the squares of the relative space dx and time dt between the events in some reference frame. This difference prevents causality violation, because light cones defined by dr = 0 or dt = ±dx partition spacetime into invariant sets of past and future at each event. Free particles follow straight trajectories, or world-lines x[t], of stationary proper time. In the curved spacetime of general relativity, the separation between two events follows from

    2dcdtequ2

    where the metric g encodes the spacetime geometry. Free particles follow curved world-lines of stationary proper time. The vacuum gravitational field equations can be derived from the Einstein-Hilbert scalar action


    where the Gaussian curvature is half the Ricci scalar curvature, K = R/2, and Lambda is the cosmological constant. By diagonalizing the metric, the invariant area element


    where g = det g.  Demanding that the action be stationary with respect to the metric, dS/dg = 0, implies the gravitational field equations


    where the Ricci tensor curvature is the variation of the Ricci scalar with the metric, R = dR/dg. The Einstein curvature G is proportional to the cosmological constant, which can be interpreted as the vacuum energy density.

    Quantum mechanics

    In classical mechanics, the action


    is the cumulative difference between a particle’s kinetic energy T and potential energy V[x]. A particle of mass m follows a worldline x[t] of stationary action. Demanding that the action be stationary with respect to the worldline, dS/dx = 0, implies Newton’s equation


    In quantum mechanics, a particle follows all worldlines. Along each worldline, it accumulates a complex amplitude whose modulus is unity and whose phase is the classical action S[x] in units of the quantum of action h. The Feynman propagator, or amplitude to transition from place a to place b, is the sum-over-histories superposition


    where Dx denotes the integration measure. The corresponding probability:


    is the absolute square of the amplitude. If the wave function 2dcdtequ9a is the amplitude to be at a place b, and the kinetic energy


    then the path integral between infinitesimally separated places implies the nonrelativistic Schrodinger wave equation


    In quantum gravity, the corresponding sum is over all spacetime geometries and the quantum phase of each geometry is the Einstein-Hilbert action. The probability amplitude to transition from one spatial geometry to another is


    If 2dcdtequ11a is the probability amplitude of a particular spatial geometry, then this path integral implies the timeless Wheeler-DeWitt equation:



     Gauss–Bonnet theorem

    In 1 + 1 = 2 dimensions, the Gauss–Bonnet theorem  dramatically simplifies the Einstein-Hilbert action by relating the total curvature of an orientable closed surface to a topological invariant. The curvature of a polyhedron is concentrated at its corners where the deficit angle

    2dcdtequ13

    and the total curvature


    where Kv is the discrete Gaussian curvature at vertex v, and av is the area closer to that vertex than any other. The Gaussian curvature of a circle of radius r is the reciprocal of its radius 1/r. The Gaussian curvature at a point on a surface is the product of the corresponding minimum and maximum sectional curvatures. Hence, the total curvature of a sphere


    like the topologically equivalent cube. More generally,


    where X is the surface’s Euler characteristic and the genus G is its number of holes. For a sphere G = 0, and for a torus G = 1. Total curvature can only change discretely and only by changing the number of holes.
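As a quick numerical check of these statements (my own illustration, not from the original post), we can verify the discrete Gauss–Bonnet relation on a cube, where the curvature is concentrated at the eight corners:

```python
import math

# Discrete Gauss-Bonnet check for a cube: three square faces (angle pi/2
# each) meet at each of the 8 corners, so each deficit angle is
# 2*pi - 3*(pi/2) = pi/2, and the total curvature should be 2*pi*X with X = 2.
def deficit_angle(face_angles):
    """Angle deficit at a vertex surrounded by the given face angles."""
    return 2 * math.pi - sum(face_angles)

cube_corner = [math.pi / 2] * 3            # three squares meet at each corner
total_curvature = 8 * deficit_angle(cube_corner)

chi = total_curvature / (2 * math.pi)      # recover the Euler characteristic
print(total_curvature, chi)                # 4*pi and 2, as for the sphere
```

The same computation with a genus-1 triangulation would give total curvature zero, matching X = 0 for the torus.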

    Wick rotation

    The sum over histories path integral is difficult to evaluate because of the oscillatory nature of its integrand. A Wick rotation converts this difficult problem in Minkowski spacetime to a simpler one in Euclidean space by introducing an imaginary time coordinate. For example,


    One motivation for the Wick rotation comes from complex analysis, where the integral of an analytic function f[z] around a closed curve in the complex plane always vanishes. If the function also decreases sufficiently rapidly at infinity, then around the contour C,


    and so


    which effectively rotates the integration 90° in the complex plane. Multiplying a complex number by i similarly rotates the number 90° in the complex plane. The Wick rotation maps the line element by


    and the area element by


    This gives the  Einstein-Hilbert action by


    and the probability amplitude by


    This Wick rotation converts the Feynman phase factor to the Boltzmann weight thereby connecting quantum and statistical mechanics, where toroidal boundary conditions and a positive cosmological constant Lambda > 0 ensure the negativity of the Euclidean action Se < 0.

    Regge calculus

    Triangulation can simplify the continuum to the discrete. The equilateral Euclidean triangle has height


    and area



    Regge calculus approximates curved spacetime by flat Minkowski triangles (or simplexes in higher dimensions) whose edge lengths ℓ may ultimately be shrunk to zero to recover a continuum theory. Curvature is concentrated in the deficit angles at the vertices. Dynamical Triangulation uses only equilateral triangles, but incorporates both positive and negative curvature by suitably gluing the triangles together. Causal Dynamical Triangulation uses only dynamical triangulations with a sliced or foliated structure, with the time-like edges glued in the same direction.


    Local light cones match to enforce global causality and preclude closed time-like curves. Regge calculus and the Gauss–Bonnet theorem dramatically simplify the Wick-rotated Euclidean action from




    where N is the number of triangles. Assuming periodic boundary conditions in space and time, so the model universe is toroidal with genus G = 1, the Euler characteristic is X = 2(1 - G) = 0 and the action


    where the rescaled cosmological constant


    The integral in the amplitude becomes the sum


    where the number of triangles is twice the number of vertices, N = 2Nv.

    The Simulation

    Monte Carlo analysis

    To analyse the statistical physics system defined by the effective partition function, take a random walk through triangulation space, sampling different geometries. The 1-2 move and its inverse 2-1 are ergodic triangulation rearrangements in 1+1 dimensions (these Pachner moves are 2-2 and 1-3 in 2d). They are special cases of the Alexander moves of combinatorial topology. The move M splits a vertex into two at the same time slice and adds two triangles, while its inverse M^-1 fuses two vertices at the same time slice and deletes two triangles. Both preserve the foliation of the spacetime. Monte Carlo sampling of the triangulations leads to a detailed balance in which moves equilibrate inverse moves. At equilibrium, the probability to be in a specific labelled triangulation of Nv + 1 vertices satisfies


    After choosing to make a move, randomly select a vertex from Nv vertices, and then randomly split one of np  past time-like edges and one of nf future time-like edges. Hence the move transition probability


    After choosing to make an inverse move, fusing the split vertices is the only way to return to the original triangulation. Hence the inverse move transition probability


    By the effective partition function, the probability of a labelled triangulation with Nv vertices is the weighted Boltzmann factor


    The probability of a move is a non-unique choice. In my simulations, I choose


    To satisfy the detailed balance, the probability of an inverse move is therefore


    Other choices may improve or worsen computational efficiency. In practice, a Metropolis algorithm accepts the rearrangements if these probabilities are greater than a uniformly distributed pseudo-random number between 0 and 1. For example, if P[M] = 0.01, then the move is unlikely to be accepted.
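A minimal sketch of such a Metropolis step, in plain Python (my own illustration; the exact move and inverse-move probabilities above involve the edge counts np and nf, which I omit here, keeping only the Boltzmann-weight ratio exp(-2*lam) for adding a vertex, since each vertex carries two triangles):

```python
import math
import random

def metropolis_step(n_vertices, lam, rng=random):
    """One Monte Carlo step: attempt a move (add a vertex, i.e. two
    triangles) or an antimove (delete a vertex) with equal probability,
    accepting each with the Metropolis rule min(1, weight ratio)."""
    if rng.random() < 0.5:                       # propose a move
        if rng.random() < min(1.0, math.exp(-2 * lam)):
            n_vertices += 1
    elif n_vertices > 3:                         # propose an antimove
        if rng.random() < min(1.0, math.exp(2 * lam)):
            n_vertices -= 1
    return n_vertices

# Evolve a toy universe and record its size every 100 attempted moves,
# as described below for the full simulation.
random.seed(0)
n = 32
sizes = []
for step in range(10_000):
    n = metropolis_step(n, lam=0.1)
    if step % 100 == 0:
        sizes.append(n)
```

The recorded `sizes` fluctuate around an equilibrium value set by the rescaled cosmological constant `lam`, which is the qualitative behaviour seen in the figures below.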

    In the initial spacetime configuration we have the universe before any move or antimove as shown below

    2dcdtfig2-initial set up of universe

    In this simulation, I only looked at the 1+1 dimensional universe, which is decomposed into 2 dimensional simplices, each with two time-like edges and one space-like edge. The data structure stores this kind of information, which turns out to be very useful for implementing the simulation. The up-pointing triangles have their spatial edges on a past time slice and the down-pointing triangles have their spatial edges on a future time slice. The time-like edges are always future pointing. In this simulation, periodic boundary conditions were chosen: the final time-slice and the first time-slice are connected, so that every vertex of the beginning time-slice is identified with a vertex on the final time-slice. The toroidal topology S1 x S1 was used to achieve the periodic boundary conditions.

    In my simulation, I used a data structure that stores certain information about each triangle in the triangulation. This information is stored in an array containing: the type of triangle (that is, whether it is up pointing or down pointing), the time slice in which it is located, and the vertices as well as the neighbours of the triangle. Using a labelling scheme I was able to store information about the triangles in a triangulation: each triangle was given a key from 0 to n-1, and its type (either type I or type II) and its neighbours were also stored. In general, the array structure for any triangle takes the form Tn = [type; time; p1; p2; p3; n1; n2; n3], where p1, p2, p3 are the vertices of the triangle and n1, n2, n3 are the neighbours of vertex 1, vertex 2 and vertex 3 respectively. This entire structure is composed strictly of integers. The integer assignments are in line with the idea of reducing the problem to a counting problem.

    This data structure is then used to perform the combinatorial moves, which split a vertex and add two triangles, and the antimoves, which remove two triangles. The moves are done by randomly picking a vertex and splitting it, adding two triangles in the gap left behind wherever the vertex was split. The antimoves involve randomly picking vertices and deleting the triangles associated with the vertex. This is the same thing as doing a random walk through space-time. As the moves and antimoves are done, the arrays are updated after every move and antimove. The universe is then made to evolve by repeatedly doing moves and antimoves, and the size of the universe is measured once every 100 such combinatorial moves.
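The per-triangle record described above can be sketched in a few lines of Python (my own illustration; the field layout follows the post's Tn = [type; time; p1; p2; p3; n1; n2; n3], and the labels in the example are hypothetical):

```python
# A minimal sketch of the per-triangle record described above.
# Everything is an integer, in line with reducing the problem to counting.
UP, DOWN = 0, 1   # type I (spatial edge on past slice) / type II (future slice)

def make_triangle(kind, time_slice, vertices, neighbours):
    """Pack one triangle into the flat integer array used by the simulation."""
    p1, p2, p3 = vertices
    n1, n2, n3 = neighbours
    return [kind, time_slice, p1, p2, p3, n1, n2, n3]

# Triangle 0: up-pointing, on time slice 0, with vertices 0, 1, 2 and
# neighbouring triangles 1, 2, 3 (illustrative labels only).
T0 = make_triangle(UP, 0, (0, 1, 2), (1, 2, 3))
print(T0)   # [0, 0, 0, 1, 2, 1, 2, 3]
```

A move then amounts to appending two such records and patching the neighbour fields of the affected triangles; an antimove deletes two records and patches the neighbours back.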


    Initial runs of the simulation give results such as those below:

    2dcdtfig1-quantum fluctuations in universe size

    model universe gif v1





    In a further post I will be exploring the critical value of the reduced cosmological constant.

    Enhanced by Zemanta

    by David Horgan at January 16, 2014 11:26 PM

    December 29, 2013

    David Horgan (quantumtetrahedron)

    cdt1+1 fig1

    This week I have been studying and working on several things. Firstly I have been reading a long paper on ‘Holomorphic Factorization for a Quantum Tetrahedron’ which I’ll review in my next post. This is quite a sophisticated paper with some quite high level mathematics so I’ve taken my time with it.

    Then I have been upgrading my work on the 3d toy model, including the presentation of the equilateral tetrahedron and hyperplanes, as seen in the posts:



    The other thing I have been working on is 1+1 dimensional causal dynamical triangulations, starting from two main references (in addition to the Loll and Ambjørn papers): 'Quantum gravity on a laptop: 1+1 dimensional causal dynamical triangulation' by Norman S. Israel and John F. Lindner, and 'Two Dimensional Causal Dynamical Triangulation' by Norman S. Israel. So far I have set up a rough framework including initial conditions, detailed balance for Monte Carlo analysis and initial graphical work.

    cdt1+1 fig

    cdt1+1 figb


    What I want to be able to do is independently construct a working CDT code and build it up from 2d to 3d and finally 4d. For the 1+1 CDT code I would be aiming to be able to independently construct simulated universes which exhibit quantum fluctuations as shown below:

    cdt1+1 fig1

    by David Horgan at December 29, 2013 11:59 PM

    December 19, 2013

    Vince Knight

    Installing and using Sage just got even easier.

    +Sage Mathematical Software System just moved to git!


    This is awesome news for a variety of reasons. First of all, it's great for development (you can take a look at the github repo here: https://github.com/sagemath/sage). There's a good talk about the git workflow for development by +Volker Braun here: http://www.youtube.com/watch?v=0tejiKN5ctY.

    The other great reason why this is awesome is that it just got really easy to use and install Sage.

    Here's a short video demonstrating everything I've done below:

    If you're familiar with git then you know this but if you're not then you can simply open up a terminal on anything *nix (linux/Mac OS) and type the following:

    $ cd ~
    $ git clone git://github.com/sagemath/sage.git 

    This basically goes to the git repository on github and clones it to a folder called sage in your home directory (if you don't have git installed you'll have to do that first).

    Once you've done that you need to 'make' sage:

    $ cd ~/sage
    $ make

    This will take a little while (it goes and gets most of what you need so it's hard to say how long as it depends on your machine) but after that you'll have Sage on your machine. If you're still in the ~/sage directory you can simply type ./sage to start sage.

    You'll want to add sage to your path so that you can use it from any directory. In the video I did this by using a bit of a trick, but here I'll do something simpler: create a symbolic link to the sage file in the ~/sage directory and place that symbolic link in your path (in /usr/local/bin). To do that, type this:

    $ ln -s ~/sage/sage /usr/local/bin/sage

    Now you can type sage anywhere and you'll get sage up and running.

    What's really great about all this is that if and when updates/development happen you can just git pull to get all up to date changes. Based on the +Sage Mathematical Software System post on G+, it looks like you can already play around with the develop branch...


    Of course if you want the easiest way to use Sage then simply grab an account on +The Sagemath Cloud. I gave a talk last week at the Cardiff Django/Python user group about it and  +William Stein was kind enough to drop in and take some questions: http://www.youtube.com/watch?v=OYVLoTL4xt8 (sound quality isn't always great because I move around a fair bit...)

    by Vincent Knight (noreply@blogger.com) at December 19, 2013 11:49 PM

    December 16, 2013

    William Stein

    Holiday Coding the SageMath Cloud

    I love the Holiday break.  I get to work on https://cloud.sagemath.com (SMC) all day again!   Right now I'm working on a multi-data center extension of http://www.gluster.org for storing a large pool of sparse compressed deduplicated ZFS image files that are efficiently replicated between data centers.  Soon SMC projects will all be hosted in this, which will mean that they can very quickly be moved between computers, are available even if all but one data center goes down, and will have ZFS snapshots instead of the current snapshot system.  ZFS snapshots are much better for this application, since you can force them to happen at a point in time, with tags, and also delete them if you want.  A little later I'll even make it so you can do a full download (to your computer) of an SMC project (and all snapshots!) by just downloading the ZFS image file and mounting it yourself. 

    I'm also continuing to work on adding a Google Compute Engine data center; the web server parts are hosted there right now, but the really interesting part will be making compute nodes available, since the GCE compute nodes are very fast. I'll be making 30GB RAM 8-core instances available, so one can start a project there and just get access to that -- for free to SMC users, despite the official price being $0.829/hour. I hope this happens soon.

    by William Stein (noreply@blogger.com) at December 16, 2013 09:27 AM

    December 14, 2013

    David Horgan (quantumtetrahedron)


    In a number of recent posts including:

    I have been studying the ‘time-fixed’ quantum tetrahedron in which a quantum tetrahedron is used to model the evolution of a state a into state b as shown below:

    physical boundary fig 1

    In this toy model the quantum tetrahedron evolves from a flat shape:

    Background independence fig3

    via an equilateral quantum tetrahedron

    Background independence fig1

    towards a long stretched out quantum tetrahedron in the limit of large T.

    Background independence fig2

    This can be displayed on a phase space diagram,


    Where “Minkowski vacuum states” are the states which minimize the energy.

    Using sagemath I am able to model the quantum tetrahedron during this evolution by varying the length of c, which is used to measure the time variable T:


    By using Vpython I am able to model this same system as shown in these animated gifs; the label in the diagram indicates the value of T.

    An animated gif showing the edge a




    An animated gif showing a different view of the quantum tetrahedron:


    Analyzing the {6j} kernel

    An important object in the modelling of the quantum tetrahedron is the Wigner {6j} symbol. As we saw in the post Physical boundary state for the quantum tetrahedron by Livine and Speziale, the quantum dynamics can be studied, as by Ponzano-Regge, by associating with the tetrahedron the amplitude

    physical boundary equ 4

    Using sagemath I was able to evaluate the value of this in some extremal cases: when b=0 and b=2j.

    Case b=0



    Case b=2j



    In the next post I will be looking at the mathematics behind the Wigner {6j} symbol in more detail.

    by David Horgan at December 14, 2013 12:00 AM

    December 10, 2013

    William Stein

    The Sagemath Cloud: a minute "elevator description"

    The Sagemath Cloud combines open source technology that has come out of cloud computing and mathematical software (e.g., web-based Sage and IPython worksheets) to make online mathematical computation easily accessible. People can collaboratively use mathematical software, author documents, use a full command line terminal, and edit complicated computer programs, all using a standard web browser with no special plugins. The core design goals of the site are collaboration and very high reliability, with data mirrored between multiple data centers. The current dedicated infrastructure should handle over a thousand simultaneous active users, and the plan is to scale up to tens of thousands of users as demand grows (about 100 users sign up each day right now). Most open source mathematical software is pre-installed, and users can also install their own copies of proprietary software, if necessary. There are currently around 1000 users on the site each day from all over the world.

    The Sagemath Cloud is under very active development, and there is an ongoing commercialization effort through University of Washington, motivated by many users who have requested more compute power, disk space, or the option to host their own install of the site. Also, though the main focus is on mathematics, the website has also been useful to people in technical areas outside mathematics that involve computation.

    by William Stein (noreply@blogger.com) at December 10, 2013 02:24 PM

    The Matroid Union

    Matroid Computation with Sage: Linear Extensions

    I’ll continue my exploration of Sage’s matroid capabilities (see also here and here). This time I’ll look at extensions within subclasses of matroids, namely representable matroids. Before we get started, let me remind you of the reference manual, in which the capabilities of Sage’s matroid package are extensively documented.

    Once again, the code examples can be run, as well as edited. The cells are linked together, so execute them in order.

    Defining linear matroids

    A linear matroid is easily defined from a representation matrix:
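(The interactive cell is not reproduced in this archive; here is a minimal sketch of such a definition, with a matrix of my own choosing, to be run in a Sage session:)

```
sage: A = Matrix([[1, 0, 0, 1, 1],
....:             [0, 1, 0, 1, 0],
....:             [0, 0, 1, 0, 1]])
sage: M = Matroid(matrix=A)
sage: M.rank()
3
```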

    Ok, so this matroid is over the field $\mathbb{Q}$. What if I want a different field? There are two places to specify that, either in the Matroid() or in the Matrix() function:

    Even more concise, we can squeeze the matrix definition into the matroid definition. The second line prints the representation, along with column labels.

    Linear extensions

    Like last time, let’s start by generating all linear extensions:

    Note that, different from last time, we immediately get a list of matroids, instead of a generator. That might change in the future, though! Let’s examine our list of extensions a little.

    There are a few options built into the method to restrict the output. First, we can request only simple extensions:

    Second, we can request only extensions that are spanned by a specific subset $F$:

    Our third option deserves its own heading.

    Partial fields

    A while ago, I started writing about partial fields. Sage has support for them built in, with some caveats: fields of fractions are very unreliable (for hashing reasons). This will be fixed in some future release of Sage; see this bug report. We will use a partial field that's safe, the dyadic partial field. For a number of others, you can use the product ring code I listed here (with a more advanced version, supporting loading and saving, in a forthcoming technical report that's currently on hold on arXiv.org).

    The key to the world of partial fields in Sage is the set of fundamental elements: given a partial field $\mathbb{P} = (R, G)$, the set of fundamental elements is $\{1\} \cup \{x \in G : 1 - x \in G\}$. These can be used to generate $\mathbb{P}$-matrices (I'll talk about the theory some other time) as follows:

    We can test whether each of these is a valid dyadic matroid, as follows:

    Other formats

    Sometimes you may not want to see the matroids, but rather just the new vectors that define the extensions. Sage can give you those!

    Read this as follows: if the chain is $\{a: 1, b: 2, d:1\}$, the new column is obtained by adding $1$ times the column labeled by $a$, $2$ times the column labeled by $b$, and $1$ times the column labeled by $d$. The advantage of this representation, rather than just a vector, is that it is independent of row operations on the representation matrix.

    To convert a chain, or just a column vector, into an extension, use the linear_extension method:

    Generating all matroids

    To get all linear matroids over a specific field or partial field, we can easily modify the code from the last post. Two caveats:

    • We replace is_isomorphic by is_field_isomorphic. That way we get one representative of each representation of the matroid. Of course, this isn't an issue until $\text{GF}(4)$. To get non-isomorphic matroids, you'll have to post-process the resulting list.
    • The method we used for canonical ordering is not available for linear matroids. Instead, we use a method called _weak_partition, and check if the new element is in the last part.

    And this prints the counts (in a slightly different form from last time: each line lists the number of matroids of a fixed size, broken down by rank):


    • Modify the above to generate all dyadic matroids.
    • Find out the dual version of all methods discussed above. Try to understand what a cochain is.

    by Stefan van Zwam at December 10, 2013 04:26 AM

    December 01, 2013

    David Horgan (quantumtetrahedron)

    quantum tetrahedron concepts

    It is known that the large-volume limit of a quantum tetrahedron is a quantum  harmonic oscillator. In particular the volume operator of a quantum tetrahedron is, in the sector of large eigenvalues, accurately described by a quantum harmonic oscillator.

    Using Vpython, it is possible to visualise the quantum tetrahedron as a semi-classical quantum simple harmonic oscillator. The implementation of this is shown below:

    Quantum simple harmonic oscillator

    The quantum tetrahedron as a  Quantum  harmonic oscillator .

    quantum tetrahedron as quantum SHM

    Click to view animated gif

    Review of the basic mathematical structure of a quantum tetrahedron

    A quantum tetrahedron consists of four angular momenta Ji, where i = {1, 2, 3, 4}, representing its faces and coupling to a vanishing total angular momentum. This means that the Hilbert space consists of all states fulfilling:

    j1 +j2 +j3 +j4 = 0

    In the coupling scheme, both pairs j1, j2 and j3, j4 couple first to two irreducible SU(2) representations of dimension 2k+1 each, which are then added to give a singlet state.

    The quantum number k ranges from kmin to kmax in integer steps with:
    kmin = max(|j1-j2|,|j3 -j4|)
    kmax = min(j1+j2,j3 +j4)

    The total dimension of the Hilbert space is given by:
    d = kmax - kmin + 1.
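These coupling bounds are easy to compute directly; here is a short pure-Python sketch of the dimension formula above (my own illustration; `tetra_hilbert_dim` is a hypothetical helper name):

```python
from fractions import Fraction

def tetra_hilbert_dim(j1, j2, j3, j4):
    """Dimension of the quantum-tetrahedron Hilbert space for spins j1..j4:
    k runs from kmin to kmax in integer steps, so d = kmax - kmin + 1."""
    kmin = max(abs(j1 - j2), abs(j3 - j4))
    kmax = min(j1 + j2, j3 + j4)
    return int(kmax - kmin + 1)

# Monochromatic case j1 = j2 = j3 = j4 = j: k runs over 0, 1, ..., 2j,
# so the dimension is d = 2j + 1.
print(tetra_hilbert_dim(1, 1, 1, 1))                      # 3
print(tetra_hilbert_dim(Fraction(5, 2), Fraction(5, 2),
                        Fraction(5, 2), Fraction(5, 2)))  # 6
```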

    The volume operator can be formulated as:

    large vol equ 1

    where the operators

    large vol equ 2

    represent the faces of the tetrahedron with

    large vol equ 3

    and γ being the Immirzi parameter.

    It is useful to use the operator

    large vol equ 4

    which, in the basis of the states can be represented as

    large vol equ 5


    large vol equ 6

    and where

    large vol equ 7

    is the area of a triangle with edges a, b, c expressed via Heron's formula.

    These choices mean that the matrix representation of Q is real and antisymmetric, and the spectrum of Q consists, for even d, of pairs of eigenvalues q and -q differing in sign. This makes it much easier to find the eigenvalues, as I found in Numerical work with sage 7.

    The large volume limit of Q is found to be given by:

    large vol equ 8

    The eigenstates of Q are just the well-known wave functions diagonalizing the harmonic oscillator in real-space representation:

    large vol equ 9

    For a monochromatic quantum tetrahedron we can obtain closed analytical expressions:


    Below is shown a sagemath program which displays the wavefunction  of a Quantum simple harmonic oscillator using Hermite polynomials Hn:


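Since the program itself appears only as an image, here is a minimal pure-Python sketch of the same idea (my own code, using the standard three-term Hermite recurrence; function names are my own):

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via the recurrence
    H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x)."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, 2.0 * x
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def sho_wavefunction(n, x):
    """Normalised harmonic-oscillator eigenfunction in dimensionless units."""
    norm = 1.0 / math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi))
    return norm * hermite(n, x) * math.exp(-x * x / 2.0)

print(hermite(2, 1.0))           # H_2(x) = 4x^2 - 2, so this is 2.0
print(sho_wavefunction(0, 0.0))  # pi^(-1/4), about 0.7511
```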
    This is a really important step in the development of Loop Quantum Gravity, because now we have a quantum field theory of geometry. This leads from a combinatorial or simplicial description of geometry, through spin foams and the quantum tetrahedron, to many-particle states of quantum tetrahedra and the emergence of spacetime as a Bose-Einstein condensate.

    quantum tetrahedron concepts

    by David Horgan at December 01, 2013 09:25 PM

    Vince Knight

    How to handle float error for plots near discontinuities in Sage

    Last week I read this blog post by +Patrick Honner. In the post +Patrick Honner plots a graph of a function with a removable discontinuity on Desmos and when zooming in enough he got some errors.

    I was waiting around to start this (ridiculously fun) Hangout on Air with a bunch of mathematicians hosted by +Amy Robinson of +Science on Google+:

    While waiting I rushed to write this blog post claiming that if you did the same thing with +Sage Mathematical Software System you did not get any errors. It was quickly pointed out to me on twitter and in the comments that I just had not zoomed in enough.

    I edited the blog post to first of all change the title (it was originally 'When Sage doesn't fail' but now reads 'When Sage also fails') and also to include some code that shows that the exact same error appears.

    On G+, +Robert Jacobson (who's the owner of the Mathematics community which you should check out if you haven't already) pointed out that you could surely use Sage's exact number fields to avoid this error.

    He put together some code and shared it with me on +The Sagemath Cloud that does exactly this. Here's a slight tweak of the code Robert wrote (hopefully you haven't changed your mind and still don't mind if I blog this Robert!):

    f(x) = (x + 2) / (x ^ 2 + 3 * x + 2) # Define the function
    discontinuity = -1 # The above function has two discontinuities, this one I don't want to plot
    hole = -2 # The hole described by Patrick Honner

    def make_list_for_plot(f, use_floats=False, zoom_level=10^7, points=1001):
        count = 0 # Adding this to count how many tries fail
        z = zoom_level
        xmin = hole - 10/z # Setting lower bound for plot
        xmax = min(hole + 10/z, discontinuity - 1/10) # Setting upper bound for plot only up until the second (messy) discontinuity
        x_vals = srange(start=xmin, end=xmax, step=(xmax-xmin)/(points-1), universe=QQ, check=True, include_endpoint=True)

        # If we are using floating point arithmetic, cast all QQ numbers to floating point numbers using the n() function.
        if use_floats:
            x_vals = map(n, x_vals)

        lst = []
        for x in x_vals:
            if x != hole and x != discontinuity: # Robert originally had a try/except statement here to pick up ANY discontinuities. This is not as good but I thought was a bit fairer...
                y = f(x)
                lst.append((x, y))

        return lst

    The code above makes sure we stay away from the discontinuity but also allows us to swap over to floating point arithmetic to see the effect. The following plots the function using exact arithmetic:

    exact_arithmetic = make_list_for_plot(f)

    p = list_plot(exact_arithmetic, plotjoined=True) # Plot f
    p += point([hole, -1], color='red', size=30) # Add a point

    We see the plot here (with no errors):

    To produce the same plot with floating point arithmetic:

    float_arithmetic = make_list_for_plot(f, use_floats=True)

    p = list_plot(float_arithmetic, plotjoined=True) # Plot f
    p += point([hole, -1], color='red', size=30) # Add a point

    We see that we now get the numerical error:

    Just to confirm, here are the same two plots at an even higher zoom:

    To change the zoom, try out the code in the Sage cell linked here: simply change the zoom_level, which was set to $10^{12}$ for the last two plots.

    (Going any higher than $10^{14}$ seems to bring in another error that does not get picked up by the if statement in my function definition. Robert originally had a try/except approach, but I thought this was in a way a 'fairer' way of doing things. Ultimately, though, it is perfectly possible to get an error-free plot.)
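    The exact-versus-float contrast can also be reproduced outside Sage. Here is a minimal plain-Python sketch of the same idea (my own illustration, not Robert's code): sampling $f$ with exact fractions.Fraction values always agrees with the simplified form $1/(x+1)$, while evaluating the unsimplified formula in doubles at the same points picks up visible noise.

```python
from fractions import Fraction

def f(x):
    # The unsimplified formula, exactly as plotted:
    # (x + 2) / (x^2 + 3x + 2), which equals 1/(x + 1) away from the hole at x = -2.
    return (x + 2) / (x**2 + 3*x + 2)

hole = -2
zoom = 10**12

# Exact rational sample points in a tiny window around the hole (skipping the hole itself).
xs = [hole + Fraction(k, zoom) for k in range(-10, 11) if k != 0]

# Exact arithmetic: every sampled value equals the simplified form 1/(x + 1).
exact_ok = all(f(x) == 1 / (x + 1) for x in xs)

# Floating point: at the same points, the unsimplified formula drifts away
# from the true value because of catastrophic cancellation.
float_noise = max(abs(f(float(x)) - float(1 / (x + 1))) for x in xs)

print(exact_ok)  # True
```

    With exact rationals the removable discontinuity is invisible; the float version of the same sample is what produces the jagged plots above.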

    by Vincent Knight (noreply@blogger.com) at December 01, 2013 07:57 AM

    November 23, 2013

    Vince Knight

    When Sage also fails

    +Patrick Honner wrote a post titled 'When Desmos Fails', which you should go read. In it he shows a quirk of Desmos (a free online graphing calculator): it seems unable to correctly graph around the removable discontinuity $(-2,-1)$ of the following function:

    $f(x) = \frac{x + 2}{x^2 + 3x + 2}$
    People who understand this better than I do say it might have something to do with how JavaScript handles floating point numbers...
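    That explanation is easy to check in plain Python, whose floats are the same IEEE 754 doubles JavaScript uses (a sketch of my own, not of Desmos internals): near $x=-2$ the numerator and denominator are both tiny differences of numbers of size about 2, so nearly all their significant digits cancel.

```python
# f(x) = (x + 2)/(x^2 + 3x + 2) simplifies to 1/(x + 1) away from x = -2,
# so very near the hole its values should sit very close to -1.
def f(x):
    return (x + 2) / (x**2 + 3*x + 2)

# Compare the unsimplified formula against the simplified one in double
# precision, in a window of width 2e-12 around the hole (skipping x = -2).
noise = max(abs(f(-2 + k * 1e-13) - 1 / (-2 + k * 1e-13 + 1))
            for k in range(-10, 11) if k != 0)

print(noise > 1e-8)  # the cancellation error dwarfs machine epsilon (~2.2e-16)
```

    Both expressions are mathematically equal here, so the gap between them is pure rounding error; that is the wobble the graphing calculator draws.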

    Anyway, I thought I'd see how +Sage Mathematical Software System handles this. Here's a Sage cell with an interact that allows you to zoom in on the point (click on Evaluate and it should run; it doesn't seem to fit too nicely embedded in my blog, so here's a link to a standalone Sage cell: http://goo.gl/WtezZ4):

    It looks like Sage doesn't have the same issues as Desmos. This is probably not a fair comparison, and it needed a bit more work than Desmos (which I can't say I've used a lot) to get running, but I thought it was worth taking a look at :)

    EDIT: If you zoom in more you do get the same behaviour as Desmos! I thought I had zoomed in to the same level as +Patrick Honner did, but perhaps I misjudged from his picture :)

    Here's the same thing in Sage (when setting $z=10^7$ in the above code):

    by Vincent Knight (noreply@blogger.com) at November 23, 2013 01:55 AM

    November 16, 2013

    Vince Knight

    Plotting complex numbers in Sage

    I had a student email me overnight asking how to plot complex numbers in +Sage Mathematical Software System.

    I spent a couple of minutes googling and found various commands that would plot complex functions:

    f = sqrt(x) + 1 / x
    complex_plot(sqrt, (-5,5), (-5, 5))

    This gives the following plot:

    This was however not what my student was asking. They wanted to know how to plot a given set of points in the complex plane (also known as the Argand plane). A quick google to check if there was anything pre-built in Sage for this brought me to this published sheet by +Jason Grout.

    I tweaked it slightly so that it was in line with the commands my students have learnt so far, added axis labels, and put the following into a function:

    def complex_point_plot(pts):
        """
        A function that returns a plot of a list of complex points.
        Arguments: pts (a list of complex numbers)
        Outputs: A list plot of the complex numbers
        """
        return list_plot([(real(i), imag(i)) for i in pts], axes_labels=['Re($z$)', 'Im($z$)'], size=30)

    This function simply returns a plot as required. Here is a small test with the output:

    complex_point_plot([3*I, e^(I*pi), e^(I*3*pi/4), 4-4*I])

    Here is some code that will plot the unit circle using the $z=e^{i\theta}$ notation for complex numbers (and the Sage srange command):

    pts = [e^(I*(theta)) for theta in srange(0, 2*pi, .1)] 

    Here is the output:

    I published all this in a worksheet on our server so it's now immediately available to all our students. I'm really enjoying teaching +Sage Mathematical Software System to our students.

    A Sage cell with the above code (that you can run in your browser) can be found here: http://goo.gl/jipzxV
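    For anyone outside Sage, the same unit-circle points can be generated with plain Python's cmath module (a sketch of my own; cmath.exp(1j*t) plays the role of Sage's e^(I*theta)):

```python
import cmath

# theta = 0, 0.1, 0.2, ... staying below 2*pi, mirroring srange(0, 2*pi, .1).
thetas = [k * 0.1 for k in range(63)]  # 63 values: 0.0 up to 6.2 < 2*pi

# Points z = e^(i*theta) on the unit circle.
pts = [cmath.exp(1j * t) for t in thetas]

# list_plot-style input: (real, imaginary) pairs.
pairs = [(z.real, z.imag) for z in pts]

print(len(pairs))  # 63
```

    Feeding pairs to any 2D point plotter reproduces the unit circle; each z has modulus 1 up to floating point rounding.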

    EDIT: Since posting this +Punarbasu Purkayastha pointed out on G+ that list_plot can handle complex points right out of the box :) So the above can just be obtained by typing:

    pts = [e^(I*(theta)) for theta in srange(0, 2*pi, .1)] 
    list_plot(pts, axes_labels=['Re($z$)','Im($z$)'], size=30)

    Learn something new every day...

    by Vincent Knight (noreply@blogger.com) at November 16, 2013 08:07 AM