April 15, 2014

William Stein

SageMathCloud's new storage architecture

Keywords: ZFS, bup, rsync, Sage

SageMathCloud (SMC) is a browser-based hosted cloud computing environment for easily collaborating on Python programs, IPython notebooks, Sage worksheets and LaTeX documents. I spent the last four months wishing very much that less people would use SMC. Today that has changed, and this post explains some of the reasons why.

Consistency Versus Availability

Consistency and availability are competing requirements. It is trivial to keep the files in a SageMathCloud project consistent if we store it in exactly one place; however, when the machine that project is on goes down for any reason, the project stops working, and the users of the project are very unhappy. By making many copies of the files in a project, it's fairly easy to ensure that the project is always available, even if network switches in multiple data centers completely fail, etc. Unfortunately, if there are too many users and the synchronization itself puts too heavy of a load on the overall system, then machines will fail more frequently, and though projects are available, files do not stay consistent and data is lost to the user (though still "out there" somewhere for me to find).

Horizontal scalability of file storage and availability of files are also competing requirements. If there are a few compute machines in one place, then they can all mount user files from one central file server. Unfortunately, this approach leads to horrible performance if instead the network is slow and has high latency; it also doesn't scale up to potentially millions of users. A benchmark I care about is downloading a Sage binary (630MB) and extracting it (creating over 70,000 files); I want this to take at most 3 minutes total, which is hard using a networked filesystem served over the general Internet between data centers. Instead, in SMC, we store the files for user projects on the compute machines themselves, which provides optimal speed. Moreover, we use a compressed filesystem, so in many cases read and write speeds are nearly twice as fast as they might be otherwise.

New Architecture of SageMathCloud

An SMC project with id project_id consists of two directories of files, replicated across several machines using rsync:
  1. The HOME directory: /projects/project_id
  2. A bup repository: /bup/bups/project_id
Users can also create files they don't care too much about in /scratch, which is a compressed and deduplicated ZFS filesystem. It is not backed up in any way, and is local to that compute.

The /projects directory is one single big ZFS filesystem, which is both lz4 compressed and deduplicated. ZFS compression is just plain awesome. ZFS deduplication is much more subtle, as deduplication is tricky to do right. Since data can be deleted at any time, one can't just use a bloom filter to very efficiently tell whether data is already known to the filesystem, and instead ZFS uses a much less memory efficient data structure. Nonetheless, deduplication works well in our situation, since the compute machines all have sufficient RAM (around 30-60GB), and the total data stored in /projects is well under 1TB. In fact, right now most compute machines have about 100GB stored in /projects.
The /bup/bups directory is also one single big ZFS filesystem; however, it is neither compressed nor deduplicated. It contains bup repositories, where bup is an awesome git-based backup tool written in Python that is designed for storing snapshots of potentially large collections of arbitrary files in a compressed and highly deduplicated way. Since the git pack format is already compressed and deduplicated, and bup itself is highly efficient at deduplication, we would gain almost nothing by using compression or deduplication directly on this ZFS filesystem. When bup deduplicates data, it does so using a sliding window through the file, unlike ZFS which simply breaks the file up into blocks, so bup does a much better job at deduplication. Right now, most compute machines have about 50GB stored in /bup/bups.

When somebody actively uses a project, the "important" working files are snapshotted about once every two minutes. These snapshots are done using bup and stored in /bup/bups/project_id, as mentioned above. After a snapshot is successfully created, the files in the working directory and in the bup repository are copied via rsync to each replica node. The users of the project do not have direct access to /bup/bups/project_id, since it is of vital importance that these snapshots cannot be corrupted or deleted, e.g., if you are sharing a project with a fat fingered colleague, you want peace of mind that even if they mess up all your files, you can easily get them back. However, all snapshots are mounted at /projects/project_id/.snapshots and browseable by the user; this uses bup's FUSE filesystem support, enhanced with some patches I wrote to support file permissions, sizes, change times, etc. Incidentally, the bup snapshots have no impact on the user's disk quota.

We also backup all of the bup archives (and the database nodes) to a single large bup archive, which we regularly backup offsite on encrypted USB drives. Right now, with nearly 50,000 projects, the total size of this large bup archive is under 250GB (!), and we can use it efficiently recover any particular version of any file in any project. The size is relatively small due to the excellent deduplication and compression that bup provides.

In addition to the bup snapshots, we also create periodic snapshots of the two ZFS filesystems mentioned above... just in case. Old snapshots are regularly deleted. These are accessible to users if they search around enough with the command line, but are not consistent between different hosts of the project, hence using them is not encouraged. This ensures that even if the whole replication/bup system were to somehow mess up a project, I can still recover everything exactly as it was before the problem happened; so far there haven't been any reports of problems.

Capacity

Right now there are about 6000 unique weekly users of SageMathCloud and often about 300-400 simultaneous users, and there are nearly 50,000 distinct projects. Our machines are at about 20% disk space capacity, and most of them can easily be expanded by a factor of 10 (from 1TB to 12TB). Similarly, disk space for our Google compute engine nodes is $0.04 GB / month. So space-wise we could scale up by a factor of 100 without too much trouble. The CPU load is at about 10% as I write this, during a busy afternoon with 363 clients connected very actively modifying 89 projects. The architecture that we have built could scale up to a million users, if only they would come our way...

by William Stein (noreply@blogger.com) at April 15, 2014 04:02 PM

April 11, 2014

Sébastien Labbé

My status report at Sage Days 57 (RecursivelyEnumeratedSet)

At Sage Days 57, I worked on the trac ticket #6637: standardize the interface to TransitiveIdeal and friends. My patch proposes to replace TransitiveIdeal and SearchForest by a new class called RecursivelyEnumeratedSet that would handle every case.

A set S is called recursively enumerable if there is an algorithm that enumerates the members of S. We consider here the recursively enumerated set that are described by some seeds and a successor function succ. The successor function may have some structure (symmetric, graded, forest) or not. Many kinds of iterators are provided: depth first search, breadth first search or elements of given depth.

TransitiveIdeal and TransitiveIdealGraded

Consider the permutations of \(\{1,2,3\}\) and the poset generated by the method permutohedron_succ:

sage: P = Permutations(3)
sage: d = {p:p.permutohedron_succ() for p in P}
sage: S = Poset(d)
sage: S.plot()
/~labbe/Files/2014/poset_123.png

The TransitiveIdeal allows to generates all permutations from the identity permutation using the method permutohedron_succ as successor function:

sage: succ = attrcall("permutohedron_succ")
sage: seed = [Permutation([1,2,3])]
sage: T = TransitiveIdeal(succ, seed)
sage: list(T)
[[1, 2, 3], [2, 1, 3], [1, 3, 2], [2, 3, 1], [3, 2, 1], [3, 1, 2]]

Remark that the previous ordering is neither breadth first neither depth first. It is a naive search because it stores the element to process in a set instead of a queue or a stack.

Note that the method permutohedron_succ produces a graded poset. Therefore, one may use the TransitiveIdealGraded class instead:

sage: T = TransitiveIdealGraded(succ, seed)
sage: list(T)
[[1, 2, 3], [2, 1, 3], [1, 3, 2], [2, 3, 1], [3, 1, 2], [3, 2, 1]]

For TransitiveIdealGraded, the enumeration is breadth first search. Althougth, if you look at the code (version Sage 6.1.1 or earlier), we see that this iterator do not make use of the graded hypothesis at all because the known set remembers every generated elements:

current_level = self._generators
known = set(current_level)
depth = 0
while len(current_level) > 0 and depth <= self._max_depth:
    next_level = set()
    for x in current_level:
        yield x
        for y in self._succ(x):
            if y == None or y in known:
                continue
            next_level.add(y)
            known.add(y)
    current_level = next_level
    depth += 1
return

Timings for TransitiveIdeal

sage: succ = attrcall("permutohedron_succ")
sage: seed = [Permutation([1..5])]
sage: T = TransitiveIdeal(succ, seed)
sage: %time L = list(T)
CPU times: user 26.6 ms, sys: 1.57 ms, total: 28.2 ms
Wall time: 28.5 ms
sage: seed = [Permutation([1..8])]
sage: T = TransitiveIdeal(succ, seed)
sage: %time L = list(T)
CPU times: user 14.4 s, sys: 141 ms, total: 14.5 s
Wall time: 14.8 s

Timings for TransitiveIdealGraded

sage: seed = [Permutation([1..5])]
sage: T = TransitiveIdealGraded(succ, seed)
sage: %time L = list(T)
CPU times: user 25.3 ms, sys: 1.04 ms, total: 26.4 ms
Wall time: 27.4 ms
sage: seed = [Permutation([1..8])]
sage: T = TransitiveIdealGraded(succ, seed)
sage: %time L = list(T)
CPU times: user 14.5 s, sys: 85.8 ms, total: 14.5 s
Wall time: 14.7 s

In conlusion, use TransitiveIdeal for naive search algorithm and use TransitiveIdealGraded for breadth search algorithm. Both class do not use the graded hypothesis.

Recursively enumerated set with a graded structure

The new class RecursivelyEnumeratedSet provides all iterators for each case. The example below are for the graded case.

Depth first search iterator:

sage: succ = attrcall("permutohedron_succ")
sage: seed = [Permutation([1..5])]
sage: R = RecursivelyEnumeratedSet(seed, succ, structure='graded')
sage: it_depth = R.depth_first_search_iterator()
sage: [next(it_depth) for _ in range(5)]
[[1, 2, 3, 4, 5],
 [1, 2, 3, 5, 4],
 [1, 2, 5, 3, 4],
 [1, 2, 5, 4, 3],
 [1, 5, 2, 4, 3]]

Breadth first search iterator:

sage: it_breadth = R.breadth_first_search_iterator()
sage: [next(it_breadth) for _ in range(5)]
[[1, 2, 3, 4, 5],
 [1, 3, 2, 4, 5],
 [1, 2, 4, 3, 5],
 [2, 1, 3, 4, 5],
 [1, 2, 3, 5, 4]]

Elements of given depth iterator:

sage: list(R.elements_of_depth_iterator(9))
[[5, 4, 2, 3, 1], [4, 5, 3, 2, 1], [5, 3, 4, 2, 1], [5, 4, 3, 1, 2]]
sage: list(R.elements_of_depth_iterator(10))
[[5, 4, 3, 2, 1]]

Levels (a level is a set of elements of the same depth):

sage: R.level(0)
[[1, 2, 3, 4, 5]]
sage: R.level(1)
{[1, 2, 3, 5, 4], [1, 2, 4, 3, 5], [1, 3, 2, 4, 5], [2, 1, 3, 4, 5]}
sage: R.level(2)
{[1, 2, 4, 5, 3],
 [1, 2, 5, 3, 4],
 [1, 3, 2, 5, 4],
 [1, 3, 4, 2, 5],
 [1, 4, 2, 3, 5],
 [2, 1, 3, 5, 4],
 [2, 1, 4, 3, 5],
 [2, 3, 1, 4, 5],
 [3, 1, 2, 4, 5]}
sage: R.level(3)
{[1, 2, 5, 4, 3],
 [1, 3, 4, 5, 2],
 [1, 3, 5, 2, 4],
 [1, 4, 2, 5, 3],
 [1, 4, 3, 2, 5],
 [1, 5, 2, 3, 4],
 [2, 1, 4, 5, 3],
 [2, 1, 5, 3, 4],
 [2, 3, 1, 5, 4],
 [2, 3, 4, 1, 5],
 [2, 4, 1, 3, 5],
 [3, 1, 2, 5, 4],
 [3, 1, 4, 2, 5],
 [3, 2, 1, 4, 5],
 [4, 1, 2, 3, 5]}
sage: R.level(9)
{[4, 5, 3, 2, 1], [5, 3, 4, 2, 1], [5, 4, 2, 3, 1], [5, 4, 3, 1, 2]}
sage: R.level(10)
{[5, 4, 3, 2, 1]}

Recursively enumerated set with a symmetric structure

We construct a recursively enumerated set with symmetric structure and depth first search for default enumeration algorithm:

sage: succ = lambda a: [(a[0]-1,a[1]), (a[0],a[1]-1), (a[0]+1,a[1]), (a[0],a[1]+1)]
sage: seeds = [(0,0)]
sage: C = RecursivelyEnumeratedSet(seeds, succ, structure='symmetric', algorithm='depth')
sage: C
A recursively enumerated set with a symmetric structure (depth first search)

In this case, depth first search is the default algorithm for iteration:

sage: it_depth = iter(C)
sage: [next(it_depth) for _ in range(10)]
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9)]

Breadth first search. This algorithm makes use of the symmetric structure and remembers only the last two levels:

sage: it_breadth = C.breadth_first_search_iterator()
sage: [next(it_breadth) for _ in range(10)]
[(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0), (-1, 1), (-2, 0), (0, 2), (2, 0), (-1, -1)]

Levels (elements of given depth):

sage: sorted(C.level(0))
[(0, 0)]
sage: sorted(C.level(1))
[(-1, 0), (0, -1), (0, 1), (1, 0)]
sage: sorted(C.level(2))
[(-2, 0), (-1, -1), (-1, 1), (0, -2), (0, 2), (1, -1), (1, 1), (2, 0)]

Timings for RecursivelyEnumeratedSet

We get same timings as for TransitiveIdeal but it uses less memory so it might be able to enumerate bigger sets:

sage: succ = attrcall("permutohedron_succ")
sage: seed = [Permutation([1..5])]
sage: R = RecursivelyEnumeratedSet(seed, succ, structure='graded')
sage: %time L = list(R)
CPU times: user 24.7 ms, sys: 1.33 ms, total: 26.1 ms
Wall time: 26.4 ms
sage: seed = [Permutation([1..8])]
sage: R = RecursivelyEnumeratedSet(seed, succ, structure='graded')
sage: %time L = list(R)
CPU times: user 14.5 s, sys: 70.2 ms, total: 14.5 s
Wall time: 14.6 s

by Sébastien Labbé at April 11, 2014 05:15 PM

April 04, 2014

Vince Knight

A list of stuff for my student to look at before getting down to some Sage development

+James Campbell will be working with me this Summer on a project aimed at developing game theoretical stuff in to +Sage Mathematical Software System. James just emailed me asking for a list of stuff he could/should read up on before he starts. I thought more knowledgeable people than me might be able to contribute so I've lazily copied my email to him here: 

------------ email ------------

Code:

- git
- sage development (tickets, documentation etc... This is something I don't know much about myself so read about it on the Safe site and watch videos on youtube there are a bunch of them)
- cython (http://cython.org/ - watch this intro to Sage lecture by +William Steinhttps://www.youtube.com/watch?v=fI4NlMfGHC0 that's the first lecture in a class he's currently giving you also could watch the rest)
- C (to help with cython - you don't necessarily need to be an expert I think)
Test driven development: (watch all this and you will know what I mean: https://www.youtube.com/playlist?list=PL5859017B018F03F4)
- ssh and *nix (so that you're comfortable to jump on to one of my machines if necessary - depending on time we might also get you to tweak the Cardiff server)
- matplotlib (the python library that Sage uses to plot stuff, good to know it from a python pov so as to be able to get Sage to make it do what we want - we might or might not use this)
- How Sage plots graphs (graph theory graphs like I used for this: http://goo.gl/KHGYk7 - we might or might not need this)

Game Theory:


We'll talk about this but 1 of the above (easy to code: 30 minutes of work) will be a gentle appetiser to the 'piece de resistance': normal form games,

- Normal form games (first 6 chapters of http://www.vincent-knight.com/teaching/gametheory/)
- The lrs algorithm (there is an implementation of this written in c that we either want to re-write to get working in Sage so you'll need to understand it or get Sage to talk to it / use it, I know Sage kind of has this as an optional library but I'm not entirely sure how to 'get at it' http://cgm.cs.mcgill.ca/~avis/C/lrs.html)
- Polytopes, you want to be comfortable-ish with the vocabulary around polytopes to be able to understand the lrs algorithm a bit. 

Books:

- In general I'd say don't spend much money on Python books. Like most OSS stuff there's an awesome amount of stuff online. Take a look at: http://pythonbooks.revolunet.com/ (a list of free Python books). There are various exceptions to this rule though.

- With regards to Sage I don't think you need a book for this project (as it's about building stuff for Sage so mainly reading about the development process and watching youtube videos is the way to go), I have a kindle copy of http://goo.gl/q9s9da, it's not bad but really everything is online. If you haven't already take a look at http://sagemath.org/help.html and join the relevant discussion groups on there.

- With regards to Game Theory there's a reading list on my website (all that follow are linked to on there). Webb's book is a gentle introduction, Algorithmic Game Theory is great and more about what you will be looking at. Finally there's a newish book by Maschler which looks really nice but I've not had time to work through it yet. In general my course site should suffice (reading/working through those books could take you years) with regards to what you need to know for game theory and I am certainly not asking you to spend money on a book. If there's a book (GT or code) that you really think would be useful let's talk.

------------

James is a pretty good coder with a good knowledge of a git based workflow already as his first year project (during which he actually learnt to code) has led his group to develop: http://thefightclub.herokuapp.com/ which is also on github (if you've made your way this far, please click on that and waste a minute or 2 of your life).

If there's anything missing from this list please add it to the comments :)

I'm looking forward to this project.

by Vincent Knight (noreply@blogger.com) at April 04, 2014 11:57 AM

March 27, 2014

Vince Knight

Scheduling group presentations using Graph Theory and Sage.

Yesterday (2014-03-26) I spoke at the Embedded Enterprise Exchange about my new first year module. I used a Hangout on air during the talk and you can see it here:

That's not what I'm going to talk about here though. The course I talked baout ends with all the 'companies' (groups of 4 students) giving a $\approx 25$ minute talk.

I need to schedule 40ish talks and this needs to fit around the student availability as well as my own. In this post I'll describe how I did this using a combination of Doodle, +Sage Mathematical Software System and Graph Theory.

The beginning of this post will be some of the boring details but towards the end I start talking about the mathematics (so feel free to skip to there...).

First of all I needed to know my students availability.

For this I simply used Doodle: https://doodle.com. I kind of use Doodle for every meeting I have to schedule (they also offer a cool tool that lets you show your availability so students/colleagues have an indication of when I might be able to meet with them).

Here's a screenshot of the responses:


You can't really see anything there as I had to zoom out a lot to grab the whole picture. Doodle allows you to download the information for any given poll in .xls format so I could relatively easily obtain the biadjacency matrix $M$ for my problem. Where $M_{ij}$ is 1 if group $i$ is available for schedule slot $j$ and 0 otherwise.


The mathematics and code needed.

Once I've got a .csv file (by tweaking the .xls file) of the biadjacency matrix I import that in to +Sage Mathematical Software System  and convert it to an instance of the `Matrix` class using the following:

import csv 
f=open(DATA+'availabilitymatrix','r') 
data = [[int(j) for j in row] for row in csv.reader(f)] 
f.close() 
M = Matrix(data)

I then need to remove any particular scheduled slots that are not picked by any company:

M = matrix([row for row in M.transpose() if max(row) != 0]).transpose()

Once I've done this I can define the bi-partite graph (bi-partite simply means that the vertices can be separated in to 2 non adjacent collections):

g = BipartiteGraph(M)

We can then get a get a picture of this, I do this using a 'partition' (a graph colouring) that will colour the groups (red) and the schedule slots (blue):

g = BipartiteGraph(M)
p = g.coloring()
g.show(layout='circular',dist=1,vertex_size=250, graph_border=True, figsize=[15,15],partition=p)

The various options I pass to the `show` command are simply to get the circular arrangement (and other minor things):



The above looks quite messy and what I essentially want is get as many pairwise matchings between the blue vertices (slots) and red vertices (companies) so that each schedule slot is attributed at most 1 company and every company has at least 1 schedule slot.

On any given graph $G=(V,E)$ this problem is known as looking for a maximal matching and can be written down mathematically:

Max:  $\sum_{e \in E(G)} m_e$
Such that:  $\forall v$ $\sum_{e \in E(G) \atop v \sim e} m_e \leq 1$

We are in essence finding a subset of edges of our original graph in such a way as to maximise the number of edges such that no vertex has more than 1 edge.

This is all explained extremely well at the +Sage Mathematical Software System documentation pages here.

Furthermore at the documentation the code needed to solve the problem is also given:

p = MixedIntegerLinearProgram()  
matching = p.new_variable(binary=True)
p.set_objective(sum(matching[e] for e in g.edges(labels=False)))
for v in g:
      p.add_constraint(sum(matching[e]
          for e in g.edges_incident(v, labels=False)) <= 1)
p.solve()

When I run the above, `p` is now a solved Mixed Integer Linear Program (corresponding to the matching problem described). To obtain the solution:

matching = p.get_values(matching)
schedule = [e for e,b in matching.iteritems() if b == 1]

Calling `schedule` gives a set of edges (denoted by the corresponding vertex numbers):

[(5, 57), (0, 45), (23, 50), (4, 42), (38, 60), (26, 56), (34, 62), (16,
68), (1, 43), (7, 40), (9, 44), (36, 58), (12, 49), (35, 71), (28, 66),
(25, 47), (24, 53), (6, 46), (3, 64), (39, 67), (17, 69), (22, 55), (13,
48), (33, 41), (10, 63), (21, 61), (30, 52), (29, 65), (37, 70), (15,
54), (19, 51), (11, 59)]

It is then really easy to draw another graph:

B=Graph(schedule)
p = B.coloring()
B.show(layout='circular',dist=1,vertex_size=250, graph_border=True, figsize=[15,15],partition=p)

Which gives:



You can see that the obtained graph has all the required properties and most importantly is a lot less congested.

Some details.

I'm leaving out some details, for example I kept track of the names of the companies and also the slots so that the final output of all this looked like this:

4InARow: Fri1100
Abacus: Tue0930
Alpha1: Thu1130
Alpha2: Mon0930
AusfallLtd: Fri1230
AxiomEnterprise: Thu1000
Batduck: Thu1430
CharliesAngles: Thu1500
CwmniRhifau: Mon1330
EasyasPi: Fri1130
Evolve: Thu1300
HSPLLtd.: Mon1030
JECT: Tue1200
JJSL: Thu1030
JNTL: Mon1400
JennyCash: Tue1630
KADE: Fri1330
MIAS: Fri1300
MIPE: Thu1100
MLC: Tue1600
Nineties: Mon0900
Promis: Fri1400
R.A.C.H: Tue1530
RYLR: Tue1230
SBTP: Fri1030
Serendipity: Mon1230
UniMath: Tue1300
VectorEnterprises: Tue1330
Venus: Thu1400
codeX: Mon1300
dydx: Wed1630
eduMath: Thu0930

(BatDuck is my favourite company name by far...)

Why did I do this this way?

There are 3 reasons:

1. I shared the schedule with my students through a published Sage sheet on our server. That way they can see a direct applied piece of mathematics and can also understand some of the code if they wanted to.
2. "Point and click doesn't scale" - I'm sure I could have solved this one instance of my problem with pen and paper and some common sense faster than it took me to write the code to solve the problem. The thing is next year when I need to schedule these talks again it will at most take me 3 minutes as the code is all here and ready to go. (Most readers of this blog won't need that explained but if any of my students find their way here: that's an important message for you).
3. It was fun.

by Vincent Knight (noreply@blogger.com) at March 27, 2014 03:35 PM

March 14, 2014

Lee Worden

Sage and WorkingWiki demo: High-level analysis of a population dynamics model

Publish Date: 
Fri, 03/14/2014 - 13:39

State-space diagram produced by the Sage framework

I've been developing a framework in Sage to work with the kind of dynamic models that my collaborators and I use a lot of the time in population biology and social-science projects. Here's a demo of what it can do:

by worden at March 14, 2014 08:55 PM

March 06, 2014

Martin Albrecht

Sage & LMonade GSOC 2014

This year Sage and lmonade are mentoring organizations for the Google Summer of Code again. We have many exciting projects for students to work (from home) over the summer, get mentored by experts in the field and get paid by Google. Sage projects lmonade projects Note that projects are not limited to the ideas presented on […]

by martinralbrecht at March 06, 2014 03:39 PM

Sébastien Labbé

Demo of the IPython notebook at Sage Paris group meeting

Today I am presenting the IPython notebook at the meeting of the Sage Paris group. This post gathers what I prepared.

Installation

First you can install the ipython notebook in Sage as explained in this previous blog post. If everything works, then you run:

sage -ipython notebook

and this will open a browser.

Turn on Sage preparsing

Create a new notebook and type:

In [1]: 3 + 3
6
In [2]: 2 / 3
0
In [3]: matrix
Traceback (most recent call last):
...
NameError: name 'matrix' is not defined

By default, Sage preparsing is turn off and Sage commands are not known. To turn on the Sage preparsing (thanks to a post of Jason on sage-devel):

%load_ext sage.misc.sage_extension

You now get Sage commands working in ipython:

In [4]: 3 + 4
Out[4]: 7
In [5]: 2 / 3
Out[5]: 2/3
In [6]: type(_)
Out[6]: <type 'sage.rings.rational.Rational'>
In [7]: matrix(3, range(9))
Out[7]:
[0 1 2]
[3 4 5]
[6 7 8]

Scroll and hide output

If the output is too big, click on Out to scroll or hide the output:

In [8]: range(1000)

Sage 3d Graphics

3D graphics works but open in a new Jmol window:

In [9]: sphere()

Sage 2d Graphics

Similarly, 2D graphics works but open in a new window:

In [10]: plot(sin(x), (x,0,10))

Inline Matplotlib graphics

To create inline matplotlib graphics, the notebook must be started with this command:

sage -ipython notebook --pylab=inline

Then, a matplotlib plot can be drawn inline (example taken from this notebook):

import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 3*np.pi, 500)
plt.plot(x, np.sin(x**2))
plt.title('A simple chirp');

Or with:

%load http://matplotlib.org/mpl_examples/showcase/integral_demo.py

According to the previous cited notebook, it seems, that the inline mode can also decided from the notebook using a magic command, but with my version of ipython (0.13.2), I get an error:

In [11]: %matplotlib inline
ERROR: Line magic function `%matplotlib` not found.

Use latex in a markdown cell

Change an input cell into a markdown cell and then you may use latex:

Test $\alpha+\beta+\gamma$

Output in latex

The output can be shown with latex and mathjax using the ipython display function:

from IPython.display import display, Math
def my_show(obj): return display(Math(latex(obj)))
y = 1 / (x^2+1)
my_show(y)

ipynb format

Create a new notebook with only one cell. Name it range_10 and save:

In [1]: range(10)
Out[1]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The file range_10.ipynb is saved in the directory. You can also download it from File > Download as > IPython (.ipynb). Here is the content of the file range_10.ipynb:

{
 "metadata": {
  "name": "range_10"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "range(10)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 1,
       "text": [
        "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
       ]
      }
     ],
     "prompt_number": 1
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}

ipynb is just json

A ipynb file is written in json format. Below, we use json to open the file `range_10.ipynb as a Python dictionnary.

sage: s = open('range_10.ipynb','r').read()
sage: import json
sage: D = json.loads(s)
sage: type(D)
dict
sage: D.keys()
[u'nbformat', u'nbformat_minor', u'worksheets', u'metadata']
sage: D
{u'metadata': {u'name': u'range_10'},
 u'nbformat': 3,
 u'nbformat_minor': 0,
 u'worksheets': [{u'cells': [{u'cell_type': u'code',
     u'collapsed': False,
     u'input': [u'range(10)'],
     u'language': u'python',
     u'metadata': {},
     u'outputs': [{u'output_type': u'pyout',
       u'prompt_number': 1,
       u'text': [u'[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]']}],
     u'prompt_number': 1},
    {u'cell_type': u'code',
     u'collapsed': False,
     u'input': [],
     u'language': u'python',
     u'metadata': {},
     u'outputs': []}],
   u'metadata': {}}]}

Load vaucanson.ipynb

Download the file vaucanson.ipynb from the last meeting of Paris Sage Users. You can view the complete demo including pictures of automaton.

IPython notebook from a Python file

In a Python file, separate your code with the following line to create cells:

# <codecell>

For example, create the following Python file. Then, import it in the notebook. It will get translated to ipynb format automatically.

# -*- coding: utf-8 -*-
# <nbformat>3.0</nbformat>

# <codecell>

%load_ext sage.misc.sage_extension

# <codecell>

matrix(4, range(16))

# <codecell>

factor(2^40-1)

More conversion

Since release 1.0 of IPython, many conversion from ipynb to other format are possible (html, latex, slides, markdown, rst, python). Unfortunately, the version of IPython in Sage is still 0.13.2 as of today but the version 1.2.1 seems available in the new Sage git format for spkg.

by Sébastien Labbé at March 06, 2014 10:47 AM

February 21, 2014

Vince Knight

Best responses to mixed strategies in class

On Monday my Game Theory class and I took a look the connection between extensive form games and normal form games (leading to subgame perfection) which correspond to these two chapters of my course: C7 and C8 but before starting that we took another look at best responses to mixed strategies (this Chapter of my course).

We have been using this game quite a bit in class:

\[\begin{pmatrix}
(2,-2) & (-2,2)\\
(-1,1) & (1,-1)
\end{pmatrix}\]

We played it before and I blogged about it here. This is a slight modification of the matching pennies game where the 1st strategy corresponds to playing Heads (\(H\)) and the second to playing Tails (\(T\))

If player 1 (the row player) is playing a mixed strategy \(\sigma_1=(x, 1-x)\) then the utility to player 2 when playing player 2 plays $H$ (the first column) can be written as:

\[
u_2(\sigma_1,H)=-2x+1-x=1-3x
\]

and when player 2 plays $T$:

\[
u_2(\sigma_1,T)=2x-1+x=3x-1
\]

We can plot these two utilities here (using +Sage Mathematical Software System):

It is immediate to note that when \(x < 1/3\) player 2 should play $T$. In fact we can write down player 2's best response \(s_2^*\) to any \(\sigma_1\):

\[
s_2^*=\begin{cases}
H,&x < 1/3\\
T,&x > 1/3\\
\text{indifferent},&
\end{cases}
\]

Using all this I played the following game in class:


  • I handed out sheets asking students to play against 3 separate mixed strategies \(\sigma_1\in\{(.2,.8),(.9,.1),(1/3,2/3)\}\). I will refer to these 3 rounds as R1, R2 and R3;
  • Students (acting as player 2) filled in their strategies;
  • I then used the following interact to sample mixed strategies according to \(\sigma_1\):

I changed the value of \(x\) as required.

Here are the three row strategies that were sampled:


  • R1: TTTTTH 
  • R2: HHHTHH 
  • R3: TTHTTT 


This is obviously not giving the exact proportions dictated by the mixed strategy \(\sigma_1\) but that's also kind of the point. By round, here are the results.

R1

Here's a plot of the mixed strategy that was played by the entire class during round 1:


This corresponds to \(\sigma_2=(.70,.30)\), so most students seemed willing to 'trust the theory' that one should play $H$ against this mixed strategy.

4 students scored the highest score (\(7\)) and here's the strategy they all played: \(HHHHHT\), in essence they got lucky and maxed out what they could have had. If they had played the theoretical best response (to only play $H$) they would have scored: 3.

The expected value of playing the theoretical best response (always pick \(H\) against this mixed strategy is: \(6(1-3\times.2)=2.4\) (recall that \(\sigma_1=(.2,.8)\) for this round).

The mean score for this round was 1.4 and here's a distribution of the scores:



47 students who 'won' ie scored a positive score (this is a zero zum game) played \(\sigma_2=(.83,.17)\). 18 'lost' (scored a negative score) playing \(.37,.63\).

It's nice to see that there was a large amount of students who did in fact score 3.

R2

Here's a plot of the mixed strategy that was played by the entire class during round 2:


This corresponds to \(\sigma_2=(.12,.88)\), which is again pretty close to the theoretical best response.

2 students scored the highest score: 11. They once again lucked out and played the perfect response: \(TTTHTT\). If they had played the theoretical best response they would have scored 9.

The expected value of playing the theoretical best response (always pick \(T\) against this mixed strategy is: \(6(3\times.9-1)=10.2\) (recall that \(\sigma_1=(.9,.1)\) for this round).

The mean score for this round was  6.9 and here's a distribution of the scores:


60 students 'won' ie scored a positive score (this is a zero zum game) playing \(\sigma_2=(.07,.93)\). 5 'lost' (scored a negative score) playing \(.77,.23\).

R3

The third round had the students playing against a mixed strategy for which they should have been indifferent. Here's how they played:


This corresponded to \(\sigma_2=(0.62,.38)\).

There were 10 winners for this game and they scored 10 (quite a few strategy profile gave this score so I won't list them but they mainly took advantage of the fact that mostly $T$ was sampled). (The theoretical utility is in fact 0 as you can see with one of the plots above).

The mean score for this round was was .4 (which is quite close to the theoretical value of 0). Here's the distribution of the scores:




28 scored positively playing \(\sigma_2=(.64,.36)\) and 37 scored negatively playing \(\sigma_2=(.77,.23)\).

What's nice to see here is that this 3rd round is a bit more random, with an almost (stretching the definition of the word almost) equal distribution between the number of students who won and lost.

Here's a distribution of the overall scores:

The overall winner of the game (who scored the most over the 3 rounds) was Becky who played:

  • R1: \(TTHHHH\)
  • R2: \(TTTTTT\)
  • R3: \(HHTHTH\)

For a cumulative score of: 21

This was good fun to analyse and was hopefully useful to my students to see what is meant by best responses to mixed strategies. It was particularly cool to see an 'indifferent' (again stretching the definition of the word indifferent) response to the third round.

(Like with everything for this course you can find the data, analysis scripts and everything else at this github repo)

by Vincent Knight (noreply@blogger.com) at February 21, 2014 03:58 AM

February 12, 2014

William Stein

What is SageMathCloud?

The two main reasons for existence of SageMathCloud (SMC) are...

Goal 1. Increase resource for Sage: Generate a different longterm revenue stream to support development of Sage, i.e., open source mathematical software. By "different", I mean different than government and foundation grants and donations, which are relatively limited for primarily pure mathematics software development, which is what Sage specializes in. Even in my wildest dreams, it is very unlikely Sage will get more than a million dollars a year in funding (and in practice it gets a lot less); however, a successful commercial product with wide adoption has the potential to generate significantly more than a million dollars a year in revenue -- of course most would go back into the product... but when the product is partly Sage, that's fine. The National Science Foundation (and other donors) have played a major part during the last 8 years in funding Sage, but I think everybody would benefit from another funding source.

Goal 2. Increase the usage of Sage: The number of unique visitors per month to http://sagemath.org grew nicely from 2005 (when I started Sage) until Summer 2011, after which point it has remained fairly constant at 70,000 unique visitors. There is no growth at all: it was 70,332 in Jan 2011, and it was 70,449 last month (Jan 2014), both with a bounce rate of about 50%. A significant obstruction to growth is accessible, which SMC helps to address for certain users (last month the SMC website has 17,700 unique visitors with a bounce rate of about 30%).

Here's an actual email I received from somebody literally as I was writing this, which I think illustrates how SMC addresses the second goal:


Hey William,

Today I stopped by cloud.sagemath.com because
I wanted to do some computation with sage, and
cloud is announced in a big way on sagemath.org

This is after a lengthy hiatus from computing
with sage ( maybe a year ).

Using cloud.sagemath.com completely blew my
mind. At first I did not really understand
why sagenb was ditched after all the work that
went into it. But man, cloud is really a
pleasure to use !

I just wanted to share the joy :)

Thanks for all that you do !

Licensing and Reuse of the SageMathCloud Codebase

The design and coding of SageMathCloud (SMC) has been mostly supported by University of Washington (UW). Due to goal 1 above, I have been working from the start (before a line of code was written) with the commercialization/tech transfer office of UW, who (because of 1) are not enthusiastic about simply open source the whole SMC codebase, as a condition for their help with commercialization. Some of SMC is open sourced, mainly the code that runs on the VM's and some of the HTML5 client that runs on the browser. We also plan to make the HTML5 client and a mini server BSD licensed, and include them with Sage (say) as a new local graphical interface. Of course SMC builds on top of many standard open source libraries and tools (e.g., CodeMirror, Cassandra, ZFS, Node.js, etc.).

There is, however, a large amount of interesting backend code, which is really the "cloud" part of SMC, and which we do not intend to release as open source. We do intend to sell licenses (with support) for the complete package, when it is sufficiently polished, since many organizations want to run their own private SMC servers, mainly for confidentiality reasons.

Goal 2 above mainly impacts how we market SMC. However, it's easy to completely ignore Sage and still get a lot of value out of SMC. I just glanced at what people are doing as I write this, and the result seems pretty typical: latex'ing documents, some Sage worksheets, some IPython notebooks, editing a perl script.

It's important to understand how SMC is different than other approaches to cloud computing. It's designed to make certain things very easy, but they are quite different things than what "traditional" cloud stacks like OpenStack are designed to make easy. SMC is supposed to make the following easy:

  • using Sage and IPython, both command line and notebook interfaces.
  • writing a paper using LaTeX (possibly with a specific private list of collaborators),
  • editing source code, e.g., developing Python/C/etc., libraries., again possibly with realtime collaboration.
  • creating collaborative "projects", which are really a Linux account on a machine, and provide isolation from other projects.
  • backups: all data is automatically snapshotted frequently
  • high availability: failure of a machine (or even whole data center) results in at most a few minutes of lost time/work.
  • speed: files are stored on a compressed local filesystem, which is snapshotted and replicated out regularly; thus the filesystem feels fast and is scalable, as compared to a networked filesystem.

The above design goals are useful for certain target audiences, e.g., people doing Sage/Python/etc. development, teachers and students in courses that make use of Sage/Python/etc., collaborative math research projects. SMC is designed so that a large number of people can make simultaneous small use of ever-expanding resources. SMC should also fully support the "social networks" that form in this context. At the same time, it's critical that SMC have excellent uptime and availability (and offsite backups, just in case), so that people can trust it. By trust, I don't mean so much in the sense of "trust it with proprietary info", but in the sense of "trust it to not just loose all my data and to be there when I'm giving a talk/teaching a class/need to do homework/etc.".

However, exactly the above design goals are at odds with some of goals of large-scale scientific/supercomputing. The following are not design goals of SMC:

  • supercomputing -- have large data that many distributed processes operate on: exactly what people often do on supercomputers (or with Hadoop, etc.)
  • traditional "cloud computing" -- dynamically spin up many VM's, run computations on them; then destroy them. With SMC, things tend to get created but not destroyed (e.g., projects and files in them), and a full VM is much too heavy given the number of users and type of usage that we have already (and plan to have).

What happens in practice with SMC is that people run smaller-scale computations on SMC (say things that just take a few cores), and when they want to run something bigger, they ssh from SMC to other resources they have (e.g., a supercomputer account) and launch computations there. All project collaborators can see what anybody types in a terminal, which can be helpful when working with remote compute clusters.

Anyway, I hope this helps to clarify what exactly SMC actually is.

by William Stein (noreply@blogger.com) at February 12, 2014 06:41 AM

February 04, 2014

Sébastien Labbé

Dessins et calculs d'orbites avec Sage d'une fonction associée à l'algo LLL

Aujourd'hui avait lieu une rencontre de l'ANR DynA3S. Suite à une présentation de Brigitte Vallée, j'ai codé quelques lignes en Sage pour étudier une fonction qu'elle a introduite. Cette fonction est reliée à la compréhension de la densité des termes sous-diagonaux dans l'exécution de l'algorithme LLL.

D'abord voici mon fichier: brigitte.sage.

Pour utiliser ce fichier, il faut d'abord l'importer dans Sage en utilisant la commande suivante. En ligne de commande, ça fonctionne bien. Dans le Sage notebook, je ne sais plus trop si la commande load permet encore de le faire (?):

sage: %runfile brigitte.sage       # not tested

On doit générer plusieurs orbites pour visualiser quelque chose, car les orbites de la fonction sont de taille 1, 2 ou 3 en général avant que la condition d'arrêt soit atteinte. Ici, on génère 10000 orbites (les points initiaux sont choisis aléatoirement et uniformément dans \([0,1]\times[-0.5, 0.5]\). On dessine les derniers points des orbites:

sage: D = plusieurs_orbit(10000)
Note: la plus longue orbite est de taille 3
sage: A = points(D[0], color='red', legend_label='derniers')
sage: B = points(D[1], color='blue', legend_label='avant derniers')
sage: C = points(D[2], color='black', legend_label='2e avant derniers')
sage: G = A + B + C
sage: G.axes_labels(("$x$", r"$\nu$"))
sage: title = r"$(x,\nu) \mapsto (\frac{x}{(x+\nu^2)^2},\frac{\nu}{(x+\nu^2)})$"
sage: G.show(title=title, xmax=2)
/~labbe/Files/2014/orbit_x_nu.png

Un raccourci pour faire à peu près le même dessin que ci-haut:

sage: draw_plusieurs_orbites(10000).show(xmax=2)

On dessine des histogrammes surperposés de la densité de ces points une fois projetés sur l'axe des \(\nu\):

sage: histogrammes_nu(10000, nbox=10)
/~labbe/Files/2014/histo_nu.png

Le dessin semble indiquer que la densité non uniforme semble provenir simplement par les points \((x,\nu)\) tels que \(x\leq 1\).

On dessine des histogrammes superposés de la densité de ces points une fois projetés sur l'axe des \(x\) (on donne des couleurs selon la valeur de \(\nu\)):

sage: histogrammes_x(30000, nbox=5, ymax=1500, xmax=8)
/~labbe/Files/2014/histo_x.png

Le dessin semble indiquer que la densité ne dépend pas de \(\nu\) pour \(x\geq 1\).

by Sébastien Labbé at February 04, 2014 08:00 AM

January 23, 2014

Vince Knight

My Game Theory YouTube Playlist and other resources

I just added Graham Poll's awesome +YouTube playlist (http://goo.gl/UZ1Ws) to my "reading" list for my Game Theory course that I'm teaching on Monday and thought that I should also include the humble videos related to Game Theory that I have on my channel:



I also thought I could get away with making a blog post about this. The playlist above has them in 'last in first out' order but here they are in the order that I made them:

1. "An introduction to mixed strategies using Sage math's interact page."

A video that looks at the 'battle of the sexes' game and also shows of a +Sage Mathematical Software System interact.

2.  "Selfish Behaviour in Queueing Systems"

A video close to my research interests which look at the intersection of Game Theory and Queueing Theory. This video is actually voiced by +Jason Young who was doing his first research internship at the time with me and will be starting his PhD at the beginning of the 2014/2015 academic year.

3. "Pigou's Example"

A video describing a type of Game called a 'routing game'. Pigou's example is a particular game that shows the damaging effect of selfish (rational) behaviour in a congestion affected system. This video also comes with a bit of +Sage Mathematical Software System code.

4. "Calculating a Tax Fare using the Shapley Value"

This is one of my most popular videos despite the error that +Brandon Hurr pointed out at 3:51. It describes a basic aspect of Cooperative Game Theory and uses the familiar example of needing to share a taxi fare as an illustration.

5. "Using agent based modelling to identify emergent behaviour in game theory"

This video shows off some Python code that I've put online that allows the creation of a population of players/agents that play any given normal form game. There are some neat animations showing the players choosing different strategies as they go.

6. "OR in Schools - Game Theory activity"

This isn't actually a video of mine. It is on +LearnAboutOR 's channel but it's a 1hr video of one of the outreach events I do which gets kids/students using Game Theory.

7. "Selfish behaviour in a single server queue"

I built a simulation of a queue (Python code here) with a graphical representation (so you see dots going across the screen). This video simply shows what it can do but also shows how selfish behaviour can have a damaging effect in queues.

I'm going to be putting together (fingers crossed: time is short) a bunch more over the coming term.

by Vincent Knight (noreply@blogger.com) at January 23, 2014 09:59 AM

December 19, 2013

Vince Knight

Installing and using Sage just got even easier.

+Sage Mathematical Software System just moved to git!

https://plus.google.com/113421169347512599264/posts/a6avm2VUL39


This is awesome news for a variety of reasons. First of all it's great for development (you can take a look at the github repo here: https://github.com/sagemath/sage. There's a good talk about the git workflow for development by +Volker Braun here: http://www.youtube.com/watch?v=0tejiKN5ctY.

The other great reason why this is awesome is that it just got really easy to use and install Sage.

Here's a short video demonstrating everything I've done below:



If you're familiar with git then you know this but if you're not then you can simply open up a terminal on anything *nix (linux/Mac OS) and type the following:

$ cd ~
$ git clone git://github.com/sagemath/sage.git 

This basically goes to the git repository on github and clones it to a folder called sage in your home directory (if you don't have git installed you'll have to do that first).

Once you've done that you need to 'make' sage:

$ cd ~/sage
$ make

This will take a little while (it goes and gets most of what you need so it's hard to say how long as it depends on your machine) but after that you'll have Sage on your machine. If you're still in the ~/sage directory you can simply type ./sage to start sage.

You'll want to add sage to your path so that you can use it from any directory. In this video I did this by using a bit of a trick but here's I'll do something simpler: create a symbolic link to the sage file in ~/sage directory and place that symbolic link in your path (in /usr/bin/local). To do that type this:

$ ln -s ~/sage/sage /usr/local/bin/sage

Now you can type sage anywhere and you'll get sage up and running.

What's really great about all this is that if and when updates/development happens you can just git pull to get all up to date changes. Based on the +Sage Mathematical Software System post on G+: here: it looks like you can already play around with the develop branch...

Awesome.

Of course if you want the easiest way to use Sage then simply grab an account on +The Sagemath Cloud. I gave a talk last week at the Cardiff Django/Python user group about it and  +William Stein was kind enough to drop in and take some questions: http://www.youtube.com/watch?v=OYVLoTL4xt8 (sound quality isn't always great because I move around a fair bit...)

by Vincent Knight (noreply@blogger.com) at December 19, 2013 11:49 PM

December 16, 2013

William Stein

Holiday Coding the SageMath Cloud

I love the Holiday break.  I get to work on https://cloud.sagemath.com (SMC) all day again!   Right now I'm working on a multi-data center extension of http://www.gluster.org for storing a large pool of sparse compressed deduplicated ZFS image files that are efficiently replicated between data centers.  Soon SMC projects will all be hosted in this, which will mean that they can very quickly be moved between computers, are available even if all but one data center goes down, and will have ZFS snapshots instead of the current snapshot system.  ZFS snapshots are much better for this application, since you can force them to happen at a point in time, with tags, and also delete them if you want.  A little later I'll even make it so you can do a full download (to your computer) of an SMC project (and all snapshots!) by just downloading the ZFS image file and mounting it yourself. 

I'm also continuing to work on adding a Google Compute Engine data center; this is the web server parts hosted there right now https://108.59.84.126/,    but the real interesting part will be making compute nodes available, since the GCE compute nodes are very fast.   I'll be making 30GB RAM 8-core instances available, so one can start a project there and just get access to that -- for free for to SMC users, despite the official price being $0.829/hour.    I hope this happens soon. 




by William Stein (noreply@blogger.com) at December 16, 2013 09:27 AM

December 10, 2013

William Stein

The Sagemath Cloud: a minute "elevator description"

The Sagemath Cloud combines open source technology that has come out of cloud computing and mathematical software (e.g., web-based Sage and IPython worksheets) to make online mathematical computation easily accessible. People can collaboratively use mathematical software, author documents, use a full command line terminal, and edit complicated computer programs, all using a standard web browser with no special plugins. The core design goals of the site are collaboration and very high reliability, with data mirrored between multiple data centers. The current dedicated infrastructure should handle over a thousand simultaneous active users, and the plan is to scale up to tens of thousands of users as demand grows (about 100 users sign up each day right now). Most open source mathematical software is pre-installed, and users can also install their own copies of proprietary software, if necessary. There are currently around 1000 users on the site each day from all over the world.

The Sagemath Cloud is under very active development, and there is an ongoing commercialization effort through University of Washington, motivated by many users who have requested more compute power, disk space, or the option to host their own install of the site. Also, though the main focus is on mathematics, the website has also been useful to people in technical areas outside mathematics that involve computation.

by William Stein (noreply@blogger.com) at December 10, 2013 02:24 PM

December 01, 2013

Vince Knight

How to handle float error for plots near discontinuities in Sage

Last week I read this blog post by +Patrick Honner. In the post +Patrick Honner plots a graph of a function with a removable discontinuity on Desmos and when zooming in enough he got some errors.

I was waiting around to start this (ridiculously fun) Hangout on Air with a bunch of mathematicians hosted by +Amy Robinson of +Science on Google+:



While waiting I rushed to write this blog post claiming that if you did the same thing with +Sage Mathematical Software System you did not get any errors. It was quickly pointed out to me on twitter and in the comments that I just had not zoomed in enough.

I edited the blog post to first of all change the title (it was originally 'When Sage doesn't fail' but now reads 'When Sage also fails') and also to include some code that shows that the exact same error appears.

On G+, +Robert Jacobson (who's the owner of the Mathematics community which you should check out if you haven't already) pointed out that you could surely use Sage's exact number fields to avoid this error.

He put together some code and shared it with me on +The Sagemath Cloud that does exactly this. Here's a slight tweak of the code Robert wrote (hopefully you haven't changed your mind and still don't mind if I blog this Robert!):

f(x) = (x + 2) / (x ^ 2 + 3 * x + 2) # Define the function
discontinuity = -1 # The above function has two discontinuities, this one I don't want to plot
hole = -2 # The hole described by Patrick Honner

def make_list_for_plot(f, use_floats=False, zoom_level=10^7, points=1001):
count = 0 # Adding this to count how many tries fail
z = zoom_level
xmin = hole - 10/z # Setting lower bound for plot
xmax = min(hole + 10/z, discontinuity - 1/10) # Setting upper bound for plot only up until the second (messy) discontinuity
x_vals = srange(start=xmin, end=xmax, step=(xmax-xmin)/(points-1), universe=QQ, check=True, include_endpoint=True)

# If we are using floating point arithmetic, cast all QQ numbers to floating point numbers using the n() function.
if use_floats:
x_vals = map(n, x_vals)

lst = []
for x in x_vals:
if x != hole and x != discontinuity: # Robert originally had a try/except statement here to pick up ANY discontinuities. This is not as good but I thought was a bit fairer...
y = f(x)
lst.append((x, y))

return lst

The code above makes sure we stay away from the discontinuity but also allows us to swap over to floating point arithmetic to see the effect. The following plots the functions using exact arithmetic:

exact_arithmetic = make_list_for_plot(f)

p = list_plot(exact_arithmetic, plotjoined=True) # Plot f
p += point([hole, -1], color='red', size=30) # Add a point
show(p)

We see the plot here (with no errors):



To call the plots with floating point arithmetic:

float_arithmetic = make_list_for_plot(f, use_floats=True)

p = list_plot(float_arithmetic, plotjoined=True) # Plot f
p += point([hole, -1], color='red', size=30) # Add a point
show(p)

We see that we now get the numerical error:



Just to confirm here is the same two plots with an even higher zoom:



To change the zoom, try out the code in the sage cell linked here: simply change the zoom_level which was set to $10^12$ for the last two plots.

(Going any higher than $10^14$ seems to bring in another error that does not get picked up by my if statement in my function definition: Robert originally had a try except method but I thought that in a way this was a 'fairer' way of doing things. Ultimately though it's very possible and easy to get an error-less plot.)

by Vincent Knight (noreply@blogger.com) at December 01, 2013 07:57 AM

November 23, 2013

Vince Knight

When Sage also fails

+Patrick Honner wrote a post titled: 'When Desmos Fails' which you should go read. In it he shows a quirk about Desmos (a free online graphing calculator) that seems to not be able to correctly graph around the removable discontinuity  $(-2,-1)$ of the following function:

$$f(x)=\frac{x+2}{x^2+3x+2}$$

People who understand this better than me say it might have something to how javascript handles floats...

Anyway I thought I'd see how +Sage Mathematical Software System could handle this. Here's a Sage cell with an interact that allows you to zoom in on the point (click on evaluate and it should run, it doesn't seem to fit too nice embedded in my blog so here's a link to a standalone Sage cell: http://goo.gl/WtezZ4):



It looks like Sage doesn't have the same issues as Desmos does. This is probably not a fair comparison, and needed a bit more work than Desmos (which I can't say I've used a lot) to get running but I thought it was worth taking a look at :)

EDIT: IF you Zoom in more you do get the same behaviour as Desmos! I thought I had zoomed in to the same level as +Patrick Honner did but perhaps I misjudged from his picture :)

Here's the same thing in Sage (when setting $z=10^7$ in the above code):


by Vincent Knight (noreply@blogger.com) at November 23, 2013 01:55 AM

November 16, 2013

Vince Knight

Plotting complex numbers in Sage

I had a student email me overnight asking how to plot complex numbers in +Sage Mathematical Software System.

I spent a couple of minutes googleing and found various command that would plot complex functions:

f = sqrt(x) + 1 / x complex_plot(sqrt, (-5,5), (-5, 5))

This gives the following plot:


This was however not what my student was asking. They wanted to know how to plot a given set of points in the complex plain (referred to as the Argand plane). A quick google to check if there was anything in Sage pre built for this brought me to this published sheet by +Jason Grout.

I tweaked it slightly so that it was in line with the commands my students have learnt so far and also to include axes legends and put the following in to a function:

def complex_point_plot(pts): 
    """ 
    A function that returns a plot of a list of complex points. 
    Arguments: pts (a list of complex numbers) 
    Outputs: A list plot of the imaginary numbers 
    """ 
    return list_plot([(real(i), imag(i)) for i in pts], axes_labels = ['Re($z$)', 'Im($z$)'], size=30)

This function simply returns a plot as required. Here is a small test with the output:

complex_point_plot([3*I, e^(I*pi), e^(I*3*pi/4), 4-4*I])




Here is some code that will plot the unit circle using the $z=e^{i\theta}$ notation for complex numbers (and the Sage srange command):

pts = [e^(I*(theta)) for theta in srange(0, 2*pi, .1)] 
complex_point_plot(pts) 


Here is the output:


I published all this in a worksheet on our server so it's now immediately available to all our students. I'm really enjoying teaching +Sage Mathematical Software System to our students.

A Sage cell with the above code (that you can run in your browser) can be found here: http://goo.gl/jipzxV

EDIT: Since posting this +Punarbasu Purkayastha pointed out on G+ that list_plot can handle complex points right out of the box :) So the above can just be obtained by typing:

pts = [e^(I*(theta)) for theta in srange(0, 2*pi, .1)] 
list_plot(pts, axes_labels=['Re($z$)','Im($z$)'], size=30)

Learn something new everyday...

by Vincent Knight (noreply@blogger.com) at November 16, 2013 08:07 AM

November 07, 2013

Harald Schilly

When will Sagemath Cloud hit 100,000 user accounts?

The Sagemath Cloud is an online environment for computational mathematics. Get an account, log in, and your web-browser transforms into almost everything you need to study algebra, calculus, numerics, statistics, number and game theory, (computational aspects of) physics, chemistry and all other quantitative sciences. It's built around all sorts of tools and utilities offered in a proper Linux environment. All files can be shared with collaborators and edited in real-time without stepping on each other's toes. Oh wait, there is also a LaTeX editor: given you know LaTeX, qualitative sciences are also covered ;-)

But beware, this posting will only talk about the very first: accounts. Below, I'll explain how I managed to create this page, the plot below and how I let it update automatically every hour - all done solely by SMC!

SMC accounts over time


Sagemath Cloud is very public about what is happening on its servers. This stats link gives you the raw data for the current overall load of the machines. What I did over the last weeks was the following: Every hour I've downloaded this stats json file, parsed it, accumulated some interesting numbers, and stored the processed data in a CSV file.

I did all this in SMC, because it allows you to run your own crontab files. Cron periodically goes through all kind of files to figure out if there is a job to do. Just enter a magical line in your own crontab file, and the given command is run whenever you tell it to do so. You do not have to be logged in!

In my case, it's like this:
First, edit your crontab file:
$ crontab -e
Then enter this line:
0 * * * * python $HOME/get.py

The $HOME is important, because you have to specify the full path where your script sits. (By the way, if you have to start something when the SMC server reboots, use the @reboot descriptor)

So, what is get.py doing? It uses the wonderful requests Python library to retrieve and parse the stats url and extracts some data. Then it appends a line to a CSV file.

Two minutes later (the crontab line starts with a "2"), another script is called which processes this CSV file. It reads the columns via pandas, properly parses the dates into time-series, and allows me to do all sorts of analysis, transformations and plots with it. For example, the plot shown above overlays the raw time series plot with OLS fits done by statsmodels for selected time ranges (where it looks "flat"). We can see the growth trends clearly! Even more important, so far the growth increases and hence we are watching the exponential growth phase as part of the beginning of the usual logistic growth.

The other plots on this statistics page show aggregated time-series data. For example, the plot for the number of concurrent connections is also increasing and reassures that SMC is scaling well.
Concurrent connections to SMC
On this statistics page, there are also a few "dynamic" fields in the HTML content. This is done by jinja2 in such a way, that the template "stats.tmpl" contains the HTML code and "mustache"-style variables. Jinja2 renders this template with some variables and that's it.

import jinja2 as j2
env = j2.Environment(loader=j2.FileSystemLoader("."))
stats = env.get_template("stats.tmpl")
data = {
        'date' : "%s UTC" % datetime.utcnow(),
        'recent_data' : totals.ix[-24:].to_html()
}
with open("stats.html", "wb") as output:
    output.write(stats.render(**data))

The last step of the script is to actually publish the files to the webserver. That's rather straight forward. First and only once, create ssh keys via ssh-keygen without a password and then use ssh-copy-id -i ~/.ssh/id_dsa.pub name@server to copy over your keys. Subsequent ssh connections will be established without any questions asked, because the remote server knows your identity. I'm using scp to copy the files: scp *.png stats.html name@remote-server:~/target/dir/

Last but not least, when will SMC hit the 100,000 user mark? In less than 6 days the 10,000 mark should be crossed and I hope the trend continues into the upward direction.

by Harald Schilly (noreply@blogger.com) at November 07, 2013 10:04 AM

October 23, 2013

Vince Knight

Pigeon holes, Markov chains and Sagemath.

On the 16/10/2013 I posted the following picture on G+:

Here's what I wrote on that post:
For a while now there's been a 'game' going on with our pigeon holes where people would put random objects in other people's pigeon holes (like the water bottle you see in the picture). These objects would then follow a random walk around the pigeon holes as each individual would find an object in their pigeon hole and absent-mindedly move it to someone else's pigeon hole.
As such each pigeon hole could be thought of as being a transient state in a Markov chain (http://en.wikipedia.org/wiki/Markov_chain). What is really awesome is that one of the PhD students here didn't seem to care when these random objects appeared in her pigeon hole. Her pigeon hole was in fact an absorbing state. This has now resulted in more or less all random objects (including a wedding photo that no one really knows the origin of) to be in her pigeon hole.
I thought I'd have a go at modelling this as an actual Markov chain. Here's a good video by a research student of mine (+Jason Young) describing the very basics of a Markov chain:



To model the movement of an object as a Markov chain we first of all need to describe the states. In our case this is pretty easy and we simply number our pigeon holes and refer to them as states. In my example there I've decided to model a situation with 12 pigeon holes.


What we now need is a set of transition probabilities which model the random behaviour of people finding an object in their pigeon hole and absent-mindedly moving it to another pigeon hole.

This will be in the form of a matrix $P$. Where $P_{ij}$ denotes the probability of going from state $i$ to state $j$.

I could sit in our photocopier room (that's where our pigeon holes are) and take notes as to where the individual who owns pigeon hole $i$ places the various objects that appear in their pigeon hole...
That would take a lot of time and sadly I don't have any time. So instead I'm going to use +Sage Mathematical Software System. The following code gives a random matrix:

N = 12
P = random_matrix(QQ, N, N)

This is just a random matrix over $\mathbb{Q}$ so we need to do tiny bit of work to make it a stochastic matrix:

P = [[abs(k) for k in row] for row in P]  # This ensures all our numbers are positive
P = matrix([[k / sum(row) for k in row] for row in P]) # This ensures that our rows all sum to 1

The definition of a stochastic matrix is any matrix $P$ such that:
  • $P$ is square
  • $P_{ij}\geq 0$ (all probabilities are non negative)
  • $\sum_{j}P_{ij}=1\;\forall\;i$ (when leaving state $i$ the probabilities of going to all other states must sum to 1)
Recall that our matrix is pretty big (12 by 12) so we the easiest way to visualise it is through a heat map:

P.plot(cmap='hsv',colorbar=True)

Here's what a plot of our matrix looks like (I created a bunch of random matrix gifs here):



We can find the steady state probability of a given object being in any given state using a very neat result (which is not actually that hard to prove). This probability vector $\pi$ (where $\pi_i$ denotes the probability of being in state $i$) will be a solution of the matrix equation:

$$\pi P = \pi$$

To solve this equation it can be shown that we simply need to find the eigenvector of $P$ corresponding to the unit eigenvalue:

eigen = P.eigenvectors_left()  # This finds the eigenvalues and eigenvectors

To normalise our eigenvector we can do this:

pi = [k[1][0] for k in eigen if k[0] == 1][0]  # Find eigenvector corresponding to unit eigenvalue
pi = [k / sum(pi) for k in pi] # normalise eigenvector

Here's a bar plot of out probability vector:

bar_chart(pi)



We can read the probabilities from this chart and see the probability of finding any given object in a particular pigeon hole. The bar_chart function in Sage still needs a bit of work and at the moment can only print a single list of data so it automatically has the axis indexed from 0 onwards (not from 1 to 12 as we would want). We can easily fix this using some matplotlib code (Sage is just wrapping matplotlib anyway):

import matplotlib.pyplot as plt

plt.figure()
plt.bar(range(1, N + 1), pi)
plt.savefig("betterbarplot.png")

Here's the plot:


We could of course pass a lot more options to the matplotlib plot to make it just as we want (and I'll in fact do this in a bit). The ability to use base python within Sage is really awesome.

One final thing we can do is run a little simulation of our objects going through the chain. To do this we're going to sample a sequence of states (pigeon holes $i$). For every $i$ we sample a random number $0\ r\leq 1$ and find $j$ such that $\sum_{j'=1}^{j}P_{ij'}. This is a random sampling technique called inverse random sampling.

import random

def nextstate(i, P):
"""
A function that takes a transition matrix P, a current state i (assumingstarting at 0) and returns the next state j
"""
r = random.random()
cumulativerow = [P[i][0]]
for k in P[i][1:]: # Iterate through elements of the transition matrix
cumulativerow.append(cumulativerow[-1] + k) # Obtain the cumulative distribution
for j in range(len(cumulativerow)):
if cumulativerow[j] >= r: # Find the next state using inverse sampling
return j
return j

states = [0]
numberofiterations = 1000
for k in range(numberofiterations):
states.append(nextstate(states[-1],P))
We can now compare our simulation to our theoretical result:
import matplotlib.pyplot as plt

plt.figure()
plt.bar(range(1, N + 1), pi, label='Theory') # Plots the theoretical results
plt.hist([k + 1 for k in states], color='red', bins=range(1, N + 2), alpha=0.5, normed=True, histtype='bar', label='Sim') # Plots the simulation result in a transparent red
plt.legend() # Tells matplotlib to place the legend
plt.xlim(1, N) # Changes the limit of the x axis
plt.xlabel('State') # Include a label for the x axis
plt.ylabel('Probability') # Include a label for the y axis
plt.title("After %s steps" % numberofiterations) # Write the title to the plot
plt.savefig("comparingsimulation.png")

We see the plot here:


A bit more flexing of muscles allows us to get the following animated gif in which we can see the simulation confirming the theoretical result:



This post assumes that all our states are transitive (although our random selection of $P$ could give us a non transitive state) but the motivation of my post is the fact that one of our students' pigeon holes was in fact absorbing. I'll write another post soon looking at that (in particular seeing which pigeon hole is most likely to move the object to the absorbing state).

by Vincent Knight (noreply@blogger.com) at October 23, 2013 05:21 AM

October 19, 2013

William Stein

Jason Grout's description of the Sagemath Cloud

Jason Grout's description of the Sagemath Cloud:

William Stein, the lead developer of Sage, has been developing a new online interface to Sage, the Sage Cloud at https://cloud.sagemath.com. Currently in beta status, it is already a powerful computation and collaboration tool. Work is organized into projects which can be shared with others. Inside a project, you can create any number of files, folders, Sage worksheets, LaTeX documents, code libraries, and other resources. Real-time collaborative editing allows multiple people to edit and chat about the same document simultaneously over the web.

The LaTeX editor features near real-time preview, forward and reverse search, and real-time collaboration. Also, it is easy to have Sage do computations or draw gures and have those automatically embedded into a LaTeX document using the SageTeX package (for example, after including the sagetex package, typing \sageplot{plot(sin(x))} in a TeX document inserts the plot of sin(x)). A complete Linux terminal is also available from the browser to work within the project directory. Snapshots are automatically saved and backed up every minute to ensure work is never lost. William is rapidly adding new features, often within days of a user requesting them.

by William Stein (noreply@blogger.com) at October 19, 2013 10:18 PM

October 12, 2013

William Stein

"A Symphony of Cursors" (guest post by Jason Grout)

Today's post is from guest blogger, Jason Grout, lead developer of the Sage Cell Server.

The other day some students and I met to do some development on the Sage cell server. We each opened up our shared project on cloud.sagemath.com on our own laptops, and started going through the code. We had a specific objective. The session went something like this:

Jason: Okay, here's the function that we need to modify. We need to change this line to do X, and we need to change this other line to do Y. We also need to write this extra function and put it here, and change this other line to do Z. James: can you do X? David: can you look up somewhere on the net how to do Y and write that extra function? I'll do Z.

Then in a matter of minutes, cursors scattering out to the different parts of the code, we had the necessary changes written. I restarted the development sage cell server running inside the cloud account and we were each able to test the changes. We realized a few more things needed to be changed, we divided up the work, and in a few more minutes each had made the necessary changes.

It was amazing: watching all of the cursors scatter out into the code, each person playing a part to make the vision come true, and then quickly coming back together to regroup, reassess, and test the final complete whole. Forgive me for waxing poetic, but it was like a symphony of cursors, each playing their own tune in their lines of the code file, weaving together a beautiful harmony. This fluid syncing William wrote takes distributed development to a new level.

Thanks!

by William Stein (noreply@blogger.com) at October 12, 2013 09:57 PM

October 10, 2013

Doxdrum

Installation of SageManifold

Hello again! If you are looking for a Differential Geometry tool, a Sage package which is under development is SageManifold. Let’s see how to install it. Download the package using the link (currently v.0.2). I’d assume it is saved on your Downloads folder. Assuming you have SAGE installed, and you have created an alias to […]

by doxdrum at October 10, 2013 01:52 PM

October 04, 2013

William Stein

Backing up the Sagemath Cloud

The terms of usage of the Sagemath Cloud say "This free service is not guaranteed to have any uptime or backups." That said, I do actually care a huge amount about backing up the data stored there, and ensuring that you don't lose your work.

Bup

I spent a lot of time building a snapshot system for user projects on top of bup. Bup is a highly efficient de-duplicating compressed backup system built on top of git; unlike other approaches, you can store arbitrary data, huge files, etc.

I looked at many open source options for making efficient de-duplicated distributed snapshots, and I think bup is overall the best, especially because the source code is readable. Right now https://cloud.sagemath.com makes several thousand bup snapshots every day, and it has practically saved people many, many hours in potentially lost work (due to them accidentally deleting or corrupting files).

You can access these snapshots by clicking on the camera icon on the right side of the file listing page.


Some lessons learned when implementing the snapshot system

  • Avoid creating a large number of branches/commits -- creating an almost-empty repo, but with say 500 branches, even with very little in them, makes things painfully slow, e.g., due to an enormous number of separate calls to git. When users interactively get directory listings, it should take at most about 1 second to get a listing, or they will be annoyed. I made some possibly-hackish optimization -- mainly caching -- to offset this issue, which are here in case anyone is interested: https://github.com/williamstein/bup (I think they are too hackish to be included in bup, but anybody is welcome to them.)

  • Run a regular test about how long it takes to access the file listing in the latest commit, and if it gets above a threshhold, create a new bup repo. So in fact the bup backup deamons really manage a sequence of bup repos. There are a bunch of these daemons running on different computers, and it was critical to implement locking, since in my experience bad things happen if you try to backup an account using two different bups at the same time. Right now, typically a bup repo will have about 2000 commits before I switch to another one.

  • When starting a commit, I wrote code to save information about the current state, so that everything could be rolled back in case an error occurs, due to files moving, network issues, the snapshot being massive due to a nefarious user, power loss, etc. This was critical to avoid the bup repo getting corrupted, and hence broken.

  • In the end, I stopped using branches, due to complexity and inefficiency, and just make all the commits in the same branch. I keep track of what is what in a separate database. Also, when making a snapshot, I record the changed files (as output by the command mentioned above) in the database with the commit, since this information can be really useful, and is impossible to get out of my backups, due to using a single branch, the bup archives being on multiple computers, and also there being multiple bup archives on each computer. NOTE: I've been recording this information for cloud.sagemath for months, but it is not yet exposed in the user interface, but will be soon.

Availability

The snapshots are distributed around the Sagemath Cloud cluster, so failure of single machines doesn't mean that backups become unavailable. I also have scripts that automatically rsync all of the snapshot repositories to machines in other locations, and keep offsite copies as well. It is thus unlikely that any file you create in cloud.sagemath could just get lost. For better or worse, is also impossible to permanently delete anything. Given the target audience of mathematicians and math students, and the terms of usage, I hope this is reasonable.

by William Stein (noreply@blogger.com) at October 04, 2013 09:10 AM

September 22, 2013

Verónica Suaste Morales

GSoC 15th - 21th september

Week for documentation.
Decoding functions for linear codes were changed to decoder.py in sage.coding
Method 'decode' from linear_code.py was modified so now the new decoding algorithms are supported.
Examples and documentation were added to each function.

https://bitbucket.org/vsuaste/coding-theory/commits/21fdf54936c34a51434ae8c93af9a708987c8820
https://bitbucket.org/vsuaste/coding-theory/commits/b710a5169e67221344f11956c0226d1a1fc97bd7

https://bitbucket.org/vsuaste/coding-theory/commits/827bf6bdb252a5771129c62d96a5668bf4a90d18
https://bitbucket.org/vsuaste/coding-theory/commits/f714bc2d14a18c02d80a3add36742d39a2d6fa30

Also I'm still working in a function that it could be added. This is about minimal support words of the code. Here some explanation: Theory

....Getting ready the final patch with alphabetical ordered functions.

by Veronica (noreply@blogger.com) at September 22, 2013 12:21 AM

September 20, 2013

Vince Knight

Revisiting examples of computer assisted mathematics


I'm in the middle of a finishing off some last things for a brand new course we're teaching at +Cardiff University starting in October. I plan on using my first lecture to explain to our new students how important computing/programming/coding is for modern mathematicians.

I will certainly be talking about the 4 colour theorem which states that any map can be coloured using 4 colours. I'll probably demo a bit of what +Sage Mathematical Software System can do. Here's a sage cell that will demo some of this (click evaluate and you should see the output, feel free to then play around with the code).


There I've written some very basic code to show a colouring of a graph on 9 vertices.

I expect that Students might find that interesting in particular if I show colourings for non planar graphs. For example here's another cell showing the procedure on complete graph on 9 vertices:


That is just a couple of 'qwerky' things that don't really go near the complexities of the proof of the 4 colour theorem.

I took to social media in the hope of asking for more examples of cool things that mathematicians use computers for. Here's a link to the blog post but without a doubt the most responses I got was on this G+ post.

You can see all the responses on that post but I thought I'd try to compile a list and quick description of some of the suggestions that caught my eye:
  • This wiki was pointed out by +Kevin Clift which contains a large amount of great animations (like the one below) made by 'Keiff':
'This is the curse of computing: giving up understanding for an easy verification.'
  • +Joerg Fliege mentioned chess endgame tablebases which I think would be a cool thing to point out to students.
  • +David Ketcheson+Dima Pasechnik and others pointed out how computer algebra systems are more or less an everyday tool for mathematicians nowadays. When I'm finding my way through a new project I normally always have a sage terminal open to try out various algebraic relationships as I go...
There are a couple of other things that I'm not listing above (mainly because I don't know enough about them to be able to comment), but interestingly enough +Timothy Gowers posted the other day a link to a paper that he has co-authored entitled: 'A fully automatic problem solver with human-style output'. In that paper a new program is described able to produce human style proofs of theorems. A blog post that +Timothy Gowers put up a while back was actually an experiment for this paper.

I'm obviously missing a large amount of other stuff so please do let me know :)

by Vincent Knight (noreply@blogger.com) at September 20, 2013 12:30 PM

September 14, 2013

Vince Knight

For no reason whatsoever: animated gifs of random matrices

Here are some animated gifs of random matrices done in Sage:




Here's a smaller 10 by 10 matrix (the above are 500 by 500) which probably needs to come with a health warning:



The code to do this using Sage is pretty easy (it makes use of the plot method on matrices):


import os

size = 500
nbrofmatrices = 100

for i in range(nbrofmatrices):
print "Ploting matrix: %i of %s" % (i + 1, nbrofmatrices)
A = random_matrix(RR,size)
p = A.plot(cmap='hsv')
p.save('./plots/%.3d.png' % i)

print "Converting plots to gif"
os.system("convert -loop 0 ./plots/*png %sanimatedmatricesofsize%s.gif" % (nbrofmatrices, size))

Each of the three above gifs were made using different colour maps: (ie changing the cmap option).

This creates 100 random 500 by 500 matrices and uses imagemagik (that's a link to a blog post I wrote about creating these) to create an animated gif.


by Vincent Knight (noreply@blogger.com) at September 14, 2013 05:31 AM

September 13, 2013

William Stein

IPython Notebooks in the Cloud with Realtime Synchronization and Support for Collaborators

I spent the last two weeks implementing hosted IPython notebooks with sync for https://cloud.sagemath.com. Initially I had just plan to simplify the port forwarding setup, since using multiple forward and reverse port forwards seemed complicated. But then I became concerned about multiple users (or users with multiple browsers) overwriting each other's notebooks; this is a real possibility, since projects are frequently shared between multiple people, and everything else does realtime sync. I had planned just to add some very minimal merge-on-save functionality to avoid major issues, but somehow got sucked into implementing full realtime sync (even with the other person's cursor showing).

Here's how to try it out


  • Go to https://cloud.sagemath.com and make an account; this is a free service hosted on computers at University of Washington.
  • Create a new project.
  • Click +New, then click "IPython"; alternatively, paste in a link to an IPython notebook (e.g., anything here http://nbviewer.ipython.org/ -- you might need to get the actual link to the ipynb file itself!), or upload a file. 
  • An IPython notebook server will start, the given .ipynb file should load in a same-domain iframe, and then some of the ipython notebook code is and iframe contents are monkey patched, in order to support sync and better integration with https://cloud.sagemath.com.
  • Open the ipynb file in multiple browsers, and see that changes in one appear in the other, including moving cells around, creating new cells, editing markdown (the rendered version appears elsewhere), etc.
Since this is all very new and the first (I guess) realtime sync implementation on top of IPython, there are probably a lot of issues. Note that if you click the "i" info button to the right, you'll get a link to the standard IPython notebook server dashboard.

IPython development

Regarding the monkey patching mentioned above, the right thing to do would be to explain exactly what hooks/changes in the IPython html client I need in order to do sync, etc., make sure these makes sense to the IPython devs, and send a pull request. As an example, in order to do sync efficiently, I have to be able to set a given cell from JSON -- it's critical to do this in place when possible, since the overhead of creating a new cell is huge (due probably to the overhead of creating CodeMirror editors); however, the fromJSON method in IPython assumes that the cell is brand new -- it would be nice to add an option to make a cell fromJSON without assuming it is empty. The ultimate outcome of this could be a clean well-defined way of doing sync for IPython notebooks using any third-party sync implementation. IPython might provide their own sync service and there are starting to be others available these days -- e.g., Google has one, and maybe Guido van Rosum helped write one for Dropbox recently?

How it works

Earlier this year, I implemented Neil Fraser's differential synchronization algorithm, since I needed it for file and Sage worksheet editing in https://cloud.sagemath.com. There are many approaches to realtime synchronization, and Fraser makes a good argument for his.  For example, Google Wave involved a different approach (Operational Transforms), whereas Google Drive/Docs uses Fraser's approach (and code -- he works at Google), and you can see which succeeded. The main idea of his approach is eventually stable iterative process that involves heuristically making and applying patches on a "best effort" basis; it allows for all live versions of the document to be modified simultaneously -- the only locking is during the moment when a patch is applied to the live document. He also explains how to handle packet loss gracefully. I did a complete implementation from scratch (except for using the beautiful Google diff/patch/match library). There might be a Python implementation of the algorithm as part of mobwrite.

The hardest part of this project was using Fraser's algorithm, which is designed for unstructured text documents, to deal with IPython's notebook format, which is a structured JSON document. I ended up defining another less structured format for IPython notebooks, which gets used purely for synchronization and nothing else. It's a plain text file whose first line is a JSON object giving metainformation; all other lines correspond, in order, to the JSON for individual cells. When patching, it is in theory possible in edge cases involving conflicts to destroy the JSON structure -- if this happens, the destruction is isolated to a single cell, and that part of the patch just gets rejected.

The IPython notebook is embedded as an iframe in the main https://cloud.sagemath.com page, but with exactly the same domain, so the main page has full access to the DOM and Javascript of the iframe. Here's what happens when a user makes changes to a synchronized IPython notebook (and at least 1 second has elapsed):
  • The outer page notices that the notebook's dirty flag is set for some reason, which could involve anything from typing a character, deleting a bunch of cells, output appearing, etc.
  • Computes the JSON representation of the notebook, and from that the document representation (with 1 line per cell) described above. This takes a couple of milliseconds, even for large documents, due to caching.
  • The document representation of the notebook gets synchronized with the version stored on the server that the client connected with. (This server is one of many node.js programs that handles many clients at once, and in turn synchronizes with another server that is running in the VM where the IPython notebook server is running.  The sync architecture itself is complicated and distributed, and I haven't described it publicly yet.)
  • In the previous step, we in fact get a patch that we apply -- in a single automatic operation (so the user is blocked for a few milliseconds) -- to our document representation of the notebook in the iframe. If there are any changes, the outer page modifies the iframe's notebook in place to match the document. My first implementation of this update used IPython's noteobook.fromJSON, which could easily take 5 seconds (!!) or more on some of the online IPython notebook samples. I spent about two days just optimizing this step. The main ideas are:
    1. Map each of the lines of the current document and the new document to a unicode character,
    2. Use diff-patch-match to find an efficient sequence of deletions, insertions, swaps to transforms one document to the other (i.e., swapping cells, moving cells, etc.) -- this is critical to do,
    3. Change cells in place when possible.
    With these tricks (and more can be done), modifying the notebook in place takes only a few milliseconds in most cases, so you don't notice this as you're typing.
  • Send a broadcast message about the position of your cursor, so the other clients can draw it.  (Symmetrically, render the cursor on receiving a broadcast message.)




by William Stein (noreply@blogger.com) at September 13, 2013 11:48 AM

September 11, 2013

Eviatar Bach

Status of special functions in Sage

Hello,

Sorry for not posting status updates in a while, but much of what I've been working on would not be interesting to a general audience of Sage users.

The Digital Library of Mathematical Functions has a Software Index, which lists the software that implement certain mathematical function. For Sage, that list is extremely out of date. Despite having sent an email to the DLMF with updates (the editor has confirmed that the table will be updated in the next release of the DLMF), I still think it's valuable to give a more detailed outline of the status of special functions in Sage so that gaps can be filled (especially the blue and violet entries, which have patches available!). Sorry about the excessive colour; I wanted to make it easy to discern the categories.

Legend
Green: Available in Sage
Blue: Patch implementing it is available
Yellow: Partially available
Violet: Available, and patch with improvements exists
Orange: Implemented in mpmath but not in Sage
Pink: Not available in mpmath nor Sage


4 Elementary Functions
5 Gamma Function
§5.24(ii) $\mathop{\Gamma}\nolimits\!\left(x\right), x\in\mathbb{R}$
§5.24(iii) $\mathop{\psi}\nolimits\!\left(x\right), \mathop{\psi^{{(n)}}}\nolimits\!\left(x\right), x\in\mathbb{R}$
§5.24(iv) $\mathop{\Gamma}\nolimits\!\left(z\right), \mathop{\psi}\nolimits\!\left(z\right), \mathop{\psi^{{(n)}}}\nolimits\!\left(z\right), z\in\mathbb{C}$
§5.24(v) $\mathop{\mathrm{B}}\nolimits\!\left(a,b\right), a,b\in\mathbb{R}$
§5.24(vi) $\mathop{\mathrm{B}}\nolimits\!\left(a,b\right), a,b\in\mathbb{C}$ http://trac.sagemath.org/ticket/12521 would fix this, but a better solution (both for speed and precision) would be to use mpmath.
6 Exponential, Logarithmic, Sine, and Cosine Integrals
7 Error Functions, Dawson’s and Fresnel Integrals
§7.25(ii) $\mathop{\mathrm{erf}}\nolimits x, \mathop{\mathrm{erfc}}\nolimits x, \mathop{\mathrm{i}^{{n}}\mathrm{erfc}}\nolimits\!\left(x\right), x\in\mathbb{R}$
§7.25(iii) $\mathop{\mathrm{erf}}\nolimits z, \mathop{\mathrm{erfc}}\nolimits z, z\in\mathbb{C}$ $\mathrm{erfc}$ is not yet implemented for complex numbers.
§7.25(iv) $\mathop{C}\nolimits\!\left(x\right), \mathop{S}\nolimits\!\left(x\right), \mathop{\mathrm{f}}\nolimits\!\left(x\right), \mathop{\mathrm{g}}\nolimits\!\left(x\right), x\in\mathbb{R}$
§7.25(v) $\mathop{C}\nolimits\!\left(z\right), \mathop{S}\nolimits\!\left(z\right), z\in\mathbb{C}$
§7.25(vi) $\mathop{\mathcal{F}}\nolimits\!\left(x\right), \mathop{G}\nolimits\!\left(x\right), \mathop{\mathsf{U}}\nolimits\!\left(x,t\right), \mathop{\mathsf{V}}\nolimits\!\left(x,t\right), x\in\mathbb{R}$
§7.25(vii) $\mathop{\mathcal{F}}\nolimits\!\left(z\right), \mathop{G}\nolimits\!\left(z\right), z\in\mathbb{C}$
8 Incomplete Gamma and Related Functions
§8.28(ii) Incomplete Gamma Functions for Real Argument and Parameter
§8.28(iii) Incomplete Gamma Functions for Complex Argument and Parameter
§8.28(v) Incomplete Beta Functions for Complex Argument and Parameters
§8.28(vi) Generalized Exponential Integral for Real Argument and Integer Parameter
§8.28(vii) Generalized Exponential Integral for Complex Argument and Parameter
9 Airy and Related Functions
§9.20(ii) $\mathop{\mathrm{Ai}}\nolimits\!\left(x\right), {\mathop{\mathrm{Ai}}\nolimits^{{\prime}}}\!\left(x\right), \mathop{\mathrm{Bi}}\nolimits\!\left(x\right), {\mathop{\mathrm{Bi}}\nolimits^{{\prime}}}\!\left(x\right), x\in\mathbb{R}$
§9.20(iii) $\mathop{\mathrm{Ai}}\nolimits\!\left(z\right), {\mathop{\mathrm{Ai}}\nolimits^{{\prime}}}\!\left(z\right), \mathop{\mathrm{Bi}}\nolimits\!\left(z\right), {\mathop{\mathrm{Bi}}\nolimits^{{\prime}}}\!\left(z\right), z\in\mathbb{C}$ See http://trac.sagemath.org/ticket/12455
§9.20(iv) Real and Complex Zeros
§9.20(v) Integrals of $\mathop{\mathrm{Ai}}\nolimits\!\left(x\right), \mathop{\mathrm{Bi}}\nolimits\!\left(x\right), x\in\mathbb{R}$ See http://trac.sagemath.org/ticket/12455
§9.20(vi) Scorer Functions
10 Bessel Functions
§10.77(ii) Bessel Functions–Real Argument and Integer or Half-Integer Order (including Spherical Bessel Functions)
§10.77(iii) Bessel Functions–Real Order and Argument
§10.77(iv) Bessel Functions–Integer or Half-Integer Order and Complex Arguments, including Kelvin Functions Kelvin functions are not implemented. They are in mpmath however.
§10.77(v) Bessel Functions–Real Order and Complex Argument (including Hankel Functions) See http://trac.sagemath.org/ticket/15024
§10.77(viii) Bessel Functions–Complex Order and Argument
§10.77(ix) Integrals of Bessel Functions
§10.77(x) Zeros of Bessel Functions
11 Struve and Related Functions
§11.16(ii) Struve Functions
§11.16(iii) Integrals of Struve Functions
§11.16(iv) Lommel Functions
§11.16(v) Anger and Weber Functions
§11.16(vi) Integrals of Anger and Weber Functions
12 Parabolic Cylinder Functions
13 Confluent Hypergeometric Functions
§13.32(ii) Real Argument and Parameters See http://trac.sagemath.org/ticket/14896
§13.32(iii)Complex Argument and/or Parameters See http://trac.sagemath.org/ticket/14896
14 Legendre and Related Functions
§14.34(ii)Legendre Functions: Real Argument and Parameters
§14.34(iii)Legendre Functions: Complex Argument and/or Parameters
§14.34(iii)Legendre Functions: Complex Argument and/or Parameters
15 Hypergeometric Function
§15.20(ii) Real Parameters and Argument See http://trac.sagemath.org/ticket/2516
§15.20(iii) Complex Parameters and Argument See http://trac.sagemath.org/ticket/2516
16 Generalized Hypergeometric Functions and Meijer G-Function
§16.27(ii) Real Argument and Parameters See http://trac.sagemath.org/ticket/2516
§16.27(iii) Complex Argument and/or Parameters See http://trac.sagemath.org/ticket/2516
18 Orthogonal Polynomials
19 Elliptic Integrals
§19.39(ii) Legendre’s and Bulirsch’s Complete Integrals See http://trac.sagemath.org/ticket/15046
§19.39(iii) Legendre’s and Bulirsch’s Incomplete Integrals See http://trac.sagemath.org/ticket/15046
§19.39(iv) Symmetric Integrals
20 Theta Functions
21 Multidimensional Theta Functions
22 Jacobian Elliptic Functions See http://trac.sagemath.org/ticket/14996
23 Weierstrass Elliptic and Modular Functions
24 Bernoulli and Euler Polynomials Euler polynomials are not implemented.
25 Zeta and Related Functions
§25.21(ii) Zeta Functions for Real Arguments
§25.21(iii) Zeta Functions for Complex Arguments
§25.21(iv) Hurwitz Zeta Function See http://trac.sagemath.org/ticket/15095
§25.21(v) Dilogarithms, Polylogarithms
§25.21(vi) Clausen’s Integral
§25.21(vii) Fermi–Dirac and Bose–Einstein Integrals
§25.21(viii) Lerch’s Transcendent
§25.21(ix) Dirichlet L-series
26 Combinatorial Analysis
27 Functions of Number Theory
28 Mathieu Functions and Hill’s Equation
30 Spheroidal Wave Functions
33 Coulomb Functions
34 3j,6j,9j Symbols
35 Functions of Matrix Argument

by Eviatar at September 11, 2013 11:15 PM

September 09, 2013

Vince Knight

Handling data files in a Sage notebook (and some linear regression in Sage)

One of my most viewed videos on +YouTube is the following (briefly demonstrating how to import csv files in to +python):



I'm in the middle of preparing various teaching materials for an upcoming class on coding for mathematicians. I'm teaching this class in a flipped classroom (if you don't know what that is circle +Robert Talbert on G+ who posts a lot of great stuff about it) and as a result I've been screen casting a lot recently. Some of these clips are solely intended for my students (as I don't believe they'd be of interest to anyone else 'to do this exercise try and think about what your base case would be for the recursion to terminate'). I'm just starting to screen cast for the +Sage Mathematical Software System part of the course and needed to put together a little something as to how to import data in to a Sage notebook. As the above video seemed quite helpful to people I thought I'd put together another one that might be helpful.



The data file used can be found here.

Here are the lines of code used in the notebook:

import csv

f = open(DATA + 'reg', 'r') # Open the data file using the special DATA variable
data = csv.reader(f)
data = [row for row in data] # Read data using csv library
f.close()

data = [[row[2], row[1]] for row in data[1:]] # Only use data that is of interest, remove unwanted columns and 1st row

a, b = var('a, b') # Declare symbolic variables
model(t) = a * t + b # Define model

fit = find_fit(data, model, solution_dict=True) # Find fit of model to data

model.subs(fit) # View the model

p = plot(model.subs(fit), 5, 11, color='red') # Plot fit
p += list_plot(data) # Plot data
p

by Vincent Knight (noreply@blogger.com) at September 09, 2013 04:45 PM

Examples of computer assisted mathematics needed.

This blog post is a very slight expansion of this G+ post where I've asked people for example of computer assisted mathematics. I'm posting here in the hope of maximum exposure.

I'm in the middle of preparing a new course that will be teaching +Python and +Sage Mathematical Software System to our first year undegraduates at Cardiff university. As such I'd appreciate as many great example as I can get so please do let me know of anything :)

I use the term mathematics in a loose sense to include 'things' that are not necessarily proofs:

- identifying conjectures;
- visualisations;
- etc

I'm familiar with the 'well known' results that can be found on this +Wikipedia page: http://goo.gl/EjyKel

I can also recommend the book 'A=B' by  Petkovsek, Wilf and Zeilberger which you can download here: http://www.math.upenn.edu/~wilf/AeqB.

The search for prime numbers is also a pretty cool thing that I can talk about.

Even some example of things that have an obvious need of computers would be great to hear about. For example I'm quite familiar with discrete event simulation techniques that I'll be mentioning to the students. Here's a video I put together a while back showing how to use a computer to simulate a queue:


by Vincent Knight (noreply@blogger.com) at September 09, 2013 06:54 AM

September 08, 2013

Verónica Suaste Morales

GSoC 1st - 8th September

This week a new optimization for the FGLM adapted algorithm has been added.
As expected, this new algoritm is faster for computing the grobner basis of the ideal associated to the linear code in the way we need it for the decoding algorithm.
Here is part of the comparisson of the groebner basis computation.

sage: C = RandomLinearCode(9,3,GF(4,'a'))                                                                                                                    
sage: I = C.create_ideal()                                                                                                                                   
sage: %time gB = I.groebner_basis('libsingular:stdfglm')                                                                                                     
CPU times: user 370.84 s, sys: 2.36 s, total: 373.19 s                                                                                                       
Wall time: 375.74 s                                                                                                                                       
sage: %time gb=C.groebner_basis_fglm()  #FGLM adapted algorithm                                                                                                                           
CPU times: user 16.36 s, sys: 0.10 s, total: 16.45 s                                                                                                         
Wall time: 16.78 s                                                                                                                                           
   
sage: C = RandomLinearCode(10,3,GF(5))
sage: %time gb = C.groebner_basis_fglm()#FGLM adapted algorithm
CPU times: user 581.37 s, sys: 2.21 s, total: 583.58 s
Wall time: 590.37 s
sage: I = C.create_ideal()
sage: %time gB = I.groebner_basis()
CPU times: user 1331.38 s, sys: 1.14 s, total: 1332.53 s
Wall time: 1336.74 s

In order to make this comparisson was necessary implement the function for creating the ideal associated to the code.  Code

Documentation of the FGLM adapted algorithm for linear codes over finite field in the general case and all other functions, including the new decoding algorithm has been documented a tested. The link to the repository:
https://bitbucket.org/vsuaste/coding-theory/commits/e3dc1a07f7d304700053981596493646ca06428f

Here some examples in which the new decoding algorithm is faster than syndrome decoding of sage:
sage: C = HammingCode(2,GF(7))
sage: C
Linear code of length 8, dimension 6 over Finite Field of size 7
sage: v = random_vector(GF(7),C.length())
sage: %time C.decode(v)    #syndrome algorithm
CPU times: user 2.10 s, sys: 0.03 s, total: 2.13 s
Wall time: 2.19 s
(0, 4, 5, 4, 1, 2, 5, 4)
sage: %time C.decode_fq(v)  #new decoding algorithm
CPU times: user 0.81 s, sys: 0.02 s, total: 0.83 s
Wall time: 0.87 s
(0, 5, 5, 1, 4, 1, 4, 6)

sage: C = HammingCode(2,GF(11))
sage: v = random_vector(GF(11), C.length())
sage: v
(0, 10, 0, 8, 10, 0, 2, 5, 2, 10, 10, 4)
sage: %time C.decode_fq(v) # new decoding algorithm
CPU times: user 53.32 s, sys: 1.45 s, total: 54.76 s
Wall time: 56.19 s
(0, 1, 0, 10, 8, 0, 0, 2, 3, 8, 8, 5)
sage: %time C.decode(v) ###syndrome algorithm still running after hours.....

sage: C = HammingCode(3,GF(4,'a'))
sage: v = random_vector(GF(4,'a'),C.length())
sage: v
(a + 1, a + 1, 0, a + 1, 0, 1, a + 1, a + 1, a + 1, a + 1, 0, 1, a, 1, a, 0, a + 1, 0, a, a + 1, 0)
sage: %time C.decode_fq(v) #new decoding algorithm
CPU times: user 81.18 s, sys: 1.36 s, total: 82.55 s
Wall time: 84.30 s
(0, 0, 0, a + 1, 0, 1, 0, 0, 0, 0, 0, 0, a, 1, 1, 0, a + 1, 0, a, a + 1, 1)
sage: %time C.decode(v) ###syndrome algorithm still running after hours.....

by Veronica (noreply@blogger.com) at September 08, 2013 10:30 PM

September 02, 2013

William Stein

Status report: integrating IPython into https://cloud.sagemath.com -- my approach

I'm still working on the IPython notebook integration into https://cloud.sagemath.com right now. This will be a valuable new feature for users, since there's a large amount of good content out there being developed as IPython notebooks, and the IPython notebook itself is fast and rock solid.

I spent the last few days (it took longer than expected) creating a generic way to *securely* proxy arbitrary http-services from cloud projects, which is now done. I haven't updated the page yet, but I implemented code so that
   

https://cloud.sagemath.com/[project-id]/port/[port number]/...

gets all http requests automatically proxied to the given port at the indicated project. Only logged in users with write access to that project can access this url -- with a lot of work, I think I've set things up so that one can safely create password-less non-ssl web services for a groub of collaborators, and all the authentication just piggy backs on cloud.sagemath accounts and projects: it's SSL-backed (with a valid cert) security almost for free, which solves what I know to be a big problem users have.

The above approach is also nice, since I can embed IPython notebooks via an iframe in cloud.sagemath pages, and the url is exactly the same as cloud.sagemath's, which avoids subtle issues with firewalls, same-source origin, etc. For comparison, here's what the iframe that contains a single ipynb worksheet looks like for wakari.io:
  

iframe class="notebookiframe" id="" src="https://prod-vz-10.wakari.io:9014/auto_login/acd84627972f91a0838e512f32e09c9823782ec0?next=/notebook_relative/Listing 2.ipynb"

and here's what it's going to look like in cloud.sagemath:
  

iframe class="notebookiframe" id="" src="https://cloud.sagemath.com/70a37ef3-4c3f-4bda-a81b-34b894c89701/port/9100/Listing 2.ipynb"

With the wakari.io approach, some users will find that notebooks just don't work, e.g., students at University of Arizona, at least if their wifi still doesn't allow connecting to nonstandard ports, like it did when I tried to setup a Sage notebook server there once for a big conference. By having exactly the same page origin and no nonstandard orts, the way I set things up, the parent page can also directly call javascript functions in the iframe (and vice versa), which is potentially very useful.

IPython notebook servers will be the first to use this framework, then I'll use something similar to serve static files directly out of projects. I'll likely also add sage cell server and the classic sage notebook as well at some point, and maybe wiki's, etc.

Having read and learned a lot of about the IPython notebook, my main concern now is their approach to multiple browsers opening the same document. If you open a single worksheet with multiple browsers, there is absolutely no synchronization at all, since there is no server-side state. Either browser can and will silently overwrite the work of the other when you (auto-)save. It's worse than the Sage Notebook, where at least there is a sequence number and the browser that is behind gets a forced refresh (and a visible warning message about their being another viewer). For running your own IPython notebook on your own computer, this probably isn't a problem (just like a desktop app), but for a long-running web service, where a single user may use a bunch of different computers (home laptop, tablet, office computer, another laptop, etc.) or there may be multiple people involved, I'm uncomfortable that it is so easy for all your work to just get overwritten, so I feel I must find some way to address this problem before releasing IPython support. With cloud.sagemath, a lot of people will likely quickly start running ipython notebook servers for groups of users, since it would take about 1 minute to setup a project with a few collaborators -- then they all get secure access to a collection of ipython notebooks (and other files). So I'm trying to figure out what to do about this. I'll probably just implement a mechanism so that the last client to open an ipython notebook gets that notebook, and all older clients get closed or locked. Maybe in a year IPython will implement proper sync, and I can remove the lock. (On the other hand, maybe they won't -- having no sync has its advantages regarding simplicity and *speed*.)


by William Stein (noreply@blogger.com) at September 02, 2013 11:47 AM

Verónica Suaste Morales

GSoC Aug 26th - 31th

For this week the work was to implement a fglm alternative algorithm for computing grobner basis of the ideal associated to a linear code over a finite field. So far I had worked only with the binary case.

It's well known that the problem of complete decoding for a linear code is a NP-hard computational, so for codes of long length or over finite fields with big cardinality the time for decoding is big.(grows exponentially)
Still I'm working in the optimization of the code to reduce as much as possible the time of decoding.
One problem I'm facing is how to generate (in the wisest way) all vectors over the finite field of given weight and certain length. Permutations and combinations in general are operations that take long times plus the fact that the number of vectors grows exponentially with the length of the code.

Here the code. Still missing documentation and details for presentation:  Code
Here an example:

sage: C = RandomLinearCode(6,3,GF(4,'a'))
sage: v = random_vector(GF(4,'a'),C.length())
sage: v
(0, a, a, a + 1, a, a + 1)

#time it takes with syndrome algorithm vs fglm algorithm
sage: %timeit C.decode(v)  #syndrome algorithm
1000 loops, best of 3: 709 us per loop

sage: %timeit C.decode_fq(v) #using my implemented fglm
1 loops, best of 3: 609 ms per loop

#solutions of syndrome and fglm algorithm are different
sage: d = C.decode(v)
sage: d
(0, 1, 0, a + 1, a, a + 1)

sage: d1 = C.decode_fq(v)
sage: d1
(0, 1, a + 1, a + 1, a + 1, a + 1)

#check that both d and d1 belong to the same coset
sage: y1 = v-d
sage: y2 = v-d1
sage: H = C.check_mat()
sage: H*y1
(0, a + 1, a)
sage: H*y2
(0, a + 1, a)



by Veronica (noreply@blogger.com) at September 02, 2013 01:12 AM

August 28, 2013

William Stein

LaTeX in the Cloud

Motivated by work on a book and by  the stacks projects, I just wrote a new web-based LaTeX editor, which I've just released. You can try it now by making a free account at https://cloud.sagemath.com, then creating a project, and uploading or creating a .tex file, then opening it.

Features

  • Side-by-side LaTeX editing, with re-build on save (you can set the autosave interval if you want).
  • Forward and inverse search.
  • Parsing of the log file, with buttons to jump to corresponding place in tex file and pdf file.
  • Preview uses high-resolution color png's, so it will work in browsers that don't have any support for pdf.
  • The command to LaTeX your document is customizable.
  • The build process should run LaTeX, bibtex, and sagetex automatically if the log file says they need to be run; otherwise you can click a button to force bibtex or sagetex to run.
  • Scales up to large documents -- my test document is a book! -- for me sitting at home working on my 134 page book, the time from making a change and clicking "save" to when it appears in the preview pane in high resolution is less than 7 seconds.

Some advantages over https://www.sharelatex.com and https://www.writelatex.com/

Disclaimer: I'm not an expert with either of the editors mentioned above, so I may be completely wrong that the following are advantages:
  • This is free (unlimited collaborators, space right now) -- my main motivation is to increase usage of Sage
  • Sagetex is fully supported
  • Forward and inverse search: jump from point in .tex file to corresponding point in pdf and conversely (it seems the competition doesn't have this, but I bet they will implement it soon after they read this post)
  • High quality png preview with customizable resolution
  • Perfect quality embedded pdf view as well, if your browser supports embedded pdf's
  • If you need a full xterm for some reason you have it: you can run arbitrary purpose programs on that command line. This means, you can download some data (file, website, database, experimental result files, use git), process them in the most general sense of computing, and generate those files or parts of it for your LaTeX document.
  • It scales up to large documents more efficiently (in my limited tests), since I was pretty careful about using hashing tricks, parallel compute to generate png's, etc.
  • A different synchronization implementation for multiple people editing the same file at once; the others lock the editor when the network drops, or reset the docuement when the connection comes back; in real life, network connections are often dropping...
  • I put some effort into trying to make this latex editor work on iPad/Android, though you'll want to use a bluetooth keyboard since there are major issues with CodeMirror and touch still.
And some disadvantages:
  • I've put little work into properly dealing with multi-file LaTeX documents
  • The error messages are not displayed embedded in the tex document (not sure I want this though).
  • You must have a cloud.sagemath account (free) -- you can't just start editing without signing up.
  • Single file download is limited to 12MB right now, so if your PDF is huge, you won't be able to just download it -- you can scp it anywhere though using the terminal.

Behind the Scenes

As a professional mathematician, I've spent 20 years using LaTeX, often enhanced with little Python scripts I write to automate the build process somewhat. Also, I've spent way too much time over the years just configuring and re-configuring forward and inverse search under Linux, OS X, and Windows with various editors and previewers.

All the new code I wrote to implement the LaTeX editor is client-side CoffeeScript, HTML, and CSS, which builds on the infrastructure I've developed over the last year (so, e.g., it can run bash scripts on remote linux machines, etc.). Here are some specific problems I confronted; none of the solutions are what I expected two weeks ago or first tried!

Problem: How should we display a PDF in the browser

I investigated three approaches to displaying PDF files in the web browser: (1) show a bunch of images (png or jpg), (2) use a native pdf viewer plugin, and (3) use a javascript pdf renderer (namely pdf.js). Regarding (2), Chrome and Safari have a native plugin that efficiently shows a high-quality display of a complete PDF embedded in a web page, but Chromium has nothing by default. Regarding (3), the Firefox devs wrote pdf.js, which they include with Firefox by default; it looks good on Firefox, but looks like total crap in Chrome. In any case, after playing around with (2)-(3) for too long (and even adding a salvus.pdf command to Sage worksheets in cloud.sagemath), I realized something: the only possible solution is (1), for the following reasons:
  • Inverse and forward search: It is impossible to read mouse clicks, page location, or control the location of the pdf viewer plugin in some browsers, e.g., in Chrome. Thus only using a PDF plugin would make inverse and forward search completely impossible. Game over.
  • It might be possible to modify pdf.js to support what is needed for inverse and forward search, but this might be really, really hard (for me). Plus the rendering quality of pdf.js on Chrome is terrible. Game over.
  • My test document is this book's PDF, which is about 8MB in size. With PDF viewer plugins, every time the PDF file changes, the entire 8MB pdf file has to be transferred to the browser, which just doesn't scale -- especially if you replace 8MB by 60MB (say). I want people to be able to write their books and Ph.D. theses using this editor. When editing a LaTeX document, the PDF file often changes only a little -- usually only a few pages changes and everything else remains identical; only the changes should get sent to the browser, so that even a 1000-page document could be efficiently edited. This sort of thing doesn't matter when working locally, but when working over the web it is critical.
So we are stuck with (1) for the main PDF preview for a file we are actively editing using LaTeX. There are a long list of apparent drawbacks:
  • One substantial drawback to (1) for general PDF display is that there is no way to do full text search or copy text out of the PDF document. Neither of these drawback matters for the LaTeX editor application though, since you have the source file right there. Also, there's nothing stopping me from also providing the embedded PDF viewer, which has search and copy, and that's what I've done for cloud.sagemath.
  • Another potential drawback of (1) is that it takes a long time to generate jpg or png images for a large pdf file -- 5 pages is fine, but what about 150 pages? 1000 pages? I tried using ImageMagick and Ghostscript. ImageMagick is way too slow to be useful for this. Ghostscript is incredibly powerful for this, and has a wide range of parameters, with numerous different rendering devices. The solution I choose here is to: (1) generate a high quality PNG image just for the currently visible pages (and +/-1), then (2) generate medium quality pages in some neighborhood of the visible pages, then (3) generate low quality PNG's for all the other pages. All this is done in parallel, since the host VM's have many cores. Also, we compute the sha1 hashes of the previews, and if the browser already has them, don't bother to update those images. Finally, it turns out to be important to replace high quality images by lower quality ones as the user scrolls through the document, since otherwise the browser can end up using too much memory. A useful trick for the high quality pages is using ghostscript's downsampling feature, so the PDF is rendered at 600dpi (say) in memory, but output at 200dpi to the PNG.
So the Preview tab in the LaTeX editor shows a png-based preview whose quality automatically enhances as you scroll through the document. This png preview will work on any browser (for which cloud.sagemath works), irregardless of PDF plugins.
Summary: It is critical to realize exactly what problem we're trying to solve, which is viewing a PDF that is often changing locally. This is completely different than the general problem of viewing a static PDF, or editing a PDF itself, or even annotating one.

Problem: how to implement forward and inverse search in the browser

Forward and inverse search let you easily jump back and forth between a point in the source tex file and the corresponding point in the rendered PDF preview. You need this because editing LaTeX documents is not WYSIWYG (unless you are using something like Lyx or Texmacs), and without this feature you might find yourself constantly being lost, doing fulltext search through the source of pdf file, etc., and generally wasting a lot of effort on something that should be automatic. The first time I used inverse and forward search was around 2004 with the Winedt and TexShopeditors, which I think (at the time) used various heuristics to implement them, since they often didn't quite work right. I 100% assumed that I would have to do use heuristics for cloud.sagemath, and started working on a heuristic approach based on pdftotext, page percentages, etc.

Then one morning I searched and learned about synctex, which was "recently" added to the core of pdflatex. The first thing I did was run it, look at the output file and try to parse with my eyes -- that didn't work. I then searched everywhere and could not find any documentation about the format of the synctex files; however, I found a paper by the author of synctex and read straight through it. In that paper, they mention that they provide a C library and C program to parse the synctex files, and explicitly don't document the format since they don't want anybody to write programs to parse it, since they reserve the right to significantly change it. No problem -- so I just call out to the shell and run the synctex program itself with appropriate options. With a little research into scaling factors, etc., I'm able to map mouse clicks on the png to the data synctex needs to get the corresponding location in the source file. This is all actually pretty easy and provides forward and inverse search with absolutely no hacks or heuristics. Also, forward search works well since using PNG's to display the preview means one can precisely set the preview location.

Problem: making sense of the LaTeX log file

When you build a LaTeX document, tex spits out a log file full of filenames, parentheses, warnings, errors, etc., sometimes stopping to ask you questions, sometimes refusing to exit, etc. This file is NOT (by default, at least) easy for a human to read, at least not me! You can see an error message that refers to a specific location in a file, but which file that is is often listed hundreds of lines before, and you must manually balance paranthesis to figure this out. I read some documents about its format, and fortunately found this Javascript library, which parses LaTeX logs. Cloud runs pdflatex using the option -interact=nonstopmode, so that the whole file gets processed, then parses the log file, and displays first errors, then typesetting issues (overfull hboxes, etc.), and finally warnings. Each message has two buttons -- one to jump to the corresponding location in tex file, and one to jump to the location in the pdf preview. This is all easy to use, and I've found myself for the first time ever actually going through tex files and cleaning up the overfull hboxes.
The log file also says when to run sagetex and bibtex, and whether or not to run pdflatex again to update cross references, and cloud parses that and automatically runs those tools. For some reason, sagetex doesn't say "run me again" even though it should when you update existing blocks, and then you have to do it manually by clicking a button.

Summary: I hope this LaTeX editor in the Sagemath Cloud is useful to people who just want to edit tex documents, play around with sagetex, and not have to worry about configuring anything. Implementing it was fun and interesting. If you have any questions about the technical details, please ask! Enjoy.

by William Stein (noreply@blogger.com) at August 28, 2013 12:41 PM

August 25, 2013

Verónica Suaste Morales

GSoC Aug 18th - 25th

For this week I have implemented a new decoding algorithm for binary linear codes using Grobner
Basis of the ideal associated to the code.
In the binary case I have decided to use the FGLM algorithm from singular to compute the Grobner
Basis, this algorithm is very fast.  So first I have to construct the ideal associated to the code.
Then, I create a test-set for the code. And use this one to apply the descent decoding algorithm.
More details about this algorithm :  Theory
The process of computing the grobner basis and the test-set is only executed once, and after that, the decoding process is very fast.(Decoding algorithms with pre-computed information)
I have done some tests to analyze the time it takes with the different decoding algorithms and differents types of binary linear codes:
Results of experiments:  Testing Algorithm

Also I have documented the functions with examples and more. The code of this algorithm and functions is in the commit:
https://bitbucket.org/vsuaste/coding-theory/commits/4d5c8255eaec112b191eb095873eaf09e5708ed4

With this algorithm I finished algorithms for binary codes. So from now, I have started trying to generalize the algorithms for the case of codes over any finite field.
In this case we expect to gain some speed with a new version of FGLM algorithm implemented specificaly for general case of codes.

by Veronica (noreply@blogger.com) at August 25, 2013 12:01 PM

August 21, 2013

Rasmi Elasmar

Sage for Android Testing APK Now Available!

After much revision and cleaning-up, the Sage Android application is now at a point where most basic features are functional, and bug reporting, feature requests, and general feedback are needed as work on the application progresses. If you’d like to try the latest APK, you can download it here. Features are always being added (an updated APK with History and Favorites will be available soon!), and you can track the latest updates at the GitHub repository.

As always, feedback and suggestions are much appreciated. Thank you!

by Rasmi at August 21, 2013 02:30 PM

August 19, 2013

August 04, 2013

Ondrej Certik

How to support both Python 2 and 3

I'll start with the conclusion: making backwards incompatible version of a language is a terrible idea, and it was bad a mistake. This mistake was somewhat corrected over the years by eventually adding features to both Python 2.7 and 3.3 that actually allow to run a single code base on both Python versions --- which, as I show below, was discouraged by both Guido and the official Python documents (though the latest docs mention it)... Nevertheless, a single code base fixes pretty much all the problems and it actually is fun to use Python again. The rest of this post explains my conclusion in great detail. My hope is that it will be useful to other Python projects to provide tips and examples how to support both Python 2 and 3, as well as to future language designers to keep languages backwards compatible.

When Python 3.x got released, it was pretty much a new language, backwards incompatible with Python 2.x, as it was not possible to run the same source code in both versions. I was extremely unhappy about this situation, because I simply didn't have time to port all my Python code to a new language.

I read the official documentation about how the transition should be done, quoting:

You should have excellent unit tests with close to full coverage.

  1. Port your project to Python 2.6.
  2. Turn on the Py3k warnings mode.
  3. Test and edit until no warnings remain.
  4. Use the 2to3 tool to convert this source code to 3.0 syntax. Do not manually edit the output!
  5. Test the converted source code under 3.0.
  6. If problems are found, make corrections to the 2.6 version of the source code and go back to step 3.
  7. When it's time to release, release separate 2.6 and 3.0 tarballs (or whatever archive form you use for releases).

I've also read Guido's blog post, which repeats the above list and adds an encouraging comment:

Python 3.0 will break backwards compatibility. Totally. We're not even aiming for a specific common subset.

In other words, one has to maintain a Python 2.x code base, then run 2to3 tool to get it converted. If you want to develop using Python 3.x, you can't, because all code must be developed using 2.x. As to the actual porting, Guido says in the above post:

If the conversion tool and the forward compatibility features in Python 2.6 work out as expected, steps (2) through (6) should not take much more effort than the typical transition from Python 2.x to 2.(x+1).

So sometime in 2010 or 2011 I started porting SymPy, which is now a pretty large code base (sloccount says over 230,000 lines of code, and in January 2010 it said almost 170,000 lines). I remember spending a few full days on it, and I just gave up, because it wasn't just changing a few things, but pretty fundamental things inside the code base, and one cannot just do it half-way, one has to get all the way through and then polish it up. We ended up using one full Google Summer of Code project for it, you can read the final report. I should mention that we use metaclasses and other things, that make such porting harder. Conclusion: this was definitely not "the typical transition from Python 2.x to 2.(x+1)".

Ok, after months of hard work by a lot of people, we finally have a Python 2.x code base that can be translated using the 2to3 tool and it works and tests pass in Python 3.x.

The next problem is that Python 3.x is pretty much like a ghetto -- you can use it as a user, but you can't develop in it. The 2to3 translation takes over 5 minutes on my laptop, so any interactivity is gone. It is true that the tool can cache results, so the next pass is somewhat faster, but in practice this still turns out to be much much worse than any compilation of C or Fortran programs (done for example with cmake), both in terms of time and in terms of robustness. And I am not even talking about pip issues or setup.py issues regarding calling 2to3. What a big mess... Programming should be fun, but this is not fun.

I'll be honest, this situation killed a lot of my enthusiasm for Python as a platform. I learned modern Fortran in the meantime and with admiration I noticed that it still compiles old F77 programs without modification and I even managed to compile a 40 year old pre-F77 code with just minimal modifications (I had to port the code to F77). Yet modern Fortran is pretty much a completely different language, with all the fancy features that one would want. Together with my colleagues I created a fortran90.org website, where you can compare Python/NumPy side by side with modern Fortran, it's pretty much 1:1 translation and a similar syntax (for numerical code), except that you need to add types of course. Yet Fortran is fully backwards compatible. What a pleasure to work with!

Fast forward to last week. A heroic effort by Sean Vig who ported SymPy to single code base (#2318) was merged. Earlier this year similar pull requests by other people have converted NumPy (#3178, #3191, #3201, #3202, #3203, #3205, #3208, #3216, #3223, #3226, #3227, #3231, #3232, #3235, #3236, #3237, #3238, #3241, #3242, #3244, #3245, #3248, #3249, #3257, #3266, #3281, #3191, ...) and SciPy (#397) codes as well. Now all these projects have just one code base and it works in all Python versions (2.x and 3.x) without the need to call the 2to3 tool.

Having a single code base, programming in Python is fun again. You can choose any Python version, be it 2.x or 3.x, and simply submit a patch. The patch is then tested using Travis-CI, so that it works in all Python versions. Installation has been simplified (no need to call any 2to3 tools and no more hacks to get setup.py working).

In other words, this is how it should be, that you write your code once, and you can use any supported language version to run it/compile it, or develop in. But for some reason, this obvious solution has been discouraged by Guido and other Python documents, as seen above. I just looked up the latest official Python docs, and that one is not upfront negative about a single code base. But it still does not recommend this approach as the one. So let me fix that: I do recommend a single code base as the solution.

The newest Python documentation from the last paragraph also mentions

Regardless of which approach you choose, porting is not as hard or time-consuming as you might initially think.

Well, I encourage you to browse through the pull requests that I linked to above for SymPy, NumPy or SciPy. I think it is very time consuming, and that's just converting from 2to3 to single code base, which is the easy part. The hard part was to actually get SymPy to work with Python 3 (as I discussed above, that took couple months of hard work), and I am pretty sure it was pretty hard to port NumPy and SciPy as well.

The docs also says:

It /single code base/ does lead to code that is not entirely idiomatic Python

That is true, but our experience has been, that with every Python version that we drop, we also delete lots of ugly hacks from our code base. This has been true for dropping support for 2.3, 2.4 and 2.5, and I expect it will also be true for dropping 2.6 and especially 2.7, when we can simply use the Python 3.x syntax. So not a big deal overall.

To sum this blog post up, as far as I am concerned, pretty much all the problems with supporting Python 2.x and 3.x are fixed by having a single code base. You can read the pull requests above to see how to implemented things (like metaclasses, and other fancy stuff...). Python is still quite the same language, you write your code, you use a Python version of your choice and things will just work. Not a big deal overall. The official documentation should be fixed to recommend this approach, and deprecate the other approaches.

I think that Python is great and I hope it will be used more in the future.

Written with StackEdit.

by Ondřej Čertík (noreply@blogger.com) at August 04, 2013 01:48 AM

July 31, 2013

Verónica Suaste Morales

GSoC Results of GDDA new decoging algorithm

As part of the evaluation for GSoC project I present the results obtained so far.

I have implemented a new decoding algorithm GDDA (gradient descent decoding algorithm) based on the grobner representation of a code.
The idea of this algorithm is that once computed the grobner representation the decoding step is pretty straight and it doesn't takes time. That's why I save the grobner representation as attribute of the code, so you only have to compute it once.
I also modified the format of the output for grobner_representation method. Now returns a dictionary representing Matphi function instead of a List.

GDDA and Grobner representation code

Here I leave you the comparison I've made of the GDDA  with decoding algorithms already implemented in Sage such as "sydrome", "nearest neighbor" and guava.
The results are very interesting. I've tried with Hamming Codes, Extended Golay Code and random linear codes.
In the case with Hamming Code the GDDA resulted to be faster than "syndrome" ,"nearest neighbor" and "guava "algorithms, even considering the time it takes to compute grobner representation.
In case with random linear code GDDA resulted to be fasted that "syndrome" and "nearest neighbor" again even considering the time it takes to GDDA computing grobner representation.
With Extended Golay Code "syndrome" and "nearest neighbot" resulted to be faster than GDDA.

GDDA Comparison

by Veronica (noreply@blogger.com) at July 31, 2013 01:47 PM

GSoC Sixth Week (July 20-27)

This week I've been documented and tested the functions I have so far, with the final purpose of opening  the ticket. Which I have done. Here I leave you the link:
http://trac.sagemath.org/ticket/14973

About the function "covering_rad()": I merged it with the existing one "covering_radius()". By parameter "algorithm" you can indicate wich one you want to use. "Algorithm = guava" use the pre existing  method (requires GAP optional packages guava). And "algorithm = None" is my implemented functions which doesn't requires optional packages.
Code

I also changed the subroutine "insert_next" because I ordered the list every time I inserted a new element, I didn't need it. So now every time I insert a new element simply it looks for the right place with respect to the specified order.
Code

by Veronica (noreply@blogger.com) at July 31, 2013 11:15 AM

July 24, 2013

Verónica Suaste Morales

GSoC Preparing Ticket

By now I'm preparing the ticket for Sage.
Here I leave you the documentation I have so far. It suppose to be parte of LinearCode class in linear_code.py
Please let me know any comment about the way I do it. I tried to follow what developers guide says but I could be missing something.

Possible Ticket documented

I also leave you the .py file, because in the preview(above) somethings get changed. (for example you can't see the different color in commented lines...)

documentation.py

Note: About the functions I have, all of them receive a degree ordering (instance of TermOrder class).
I'm thinking change it, and instead it could receive a string with the name of the degree ordering, then I create an instance of TermOrder in the function. I think would be much more intuitive.


by Veronica (noreply@blogger.com) at July 24, 2013 11:48 AM

July 22, 2013

Verónica Suaste Morales

GSoC Report (Third Week)

This time I implemented a function that given a linear code and a degree ordering it returs the set of coset leaders.
The important thing about this functions is that after we have it, we can calculate parameters of the code almost directly, such as: newton radius, covering radius and weight distribution of the cosets. I implemented them as well.

The algorithm and expalantion goes here : Algorithm

The code of the function goes here: Code

And, I already tested it with some examples in Sage : Examples

Notes: The algorithm for computing coset leaders is almost the same for computing the grobner representation(Second Week) . So the idea is that if the grobner representation is computed, we save the set of the coset leaders as an atribute of the code. In this way we don't have to compute the set of coset leaders every time we want some of the parameters I mentioned before. We only have to do it once.

by Veronica (noreply@blogger.com) at July 22, 2013 06:03 PM

July 21, 2013

Verónica Suaste Morales

GSoC Fifth Week (14-19 July)

This week the main goal was to implement the function that given a Binary Linear Code it returns
a reduced Grobner basis. 
We want the Grobner basis to later compute the test-set(see definition in the pdf). This structure is the last thing we'll need to then start programming the new decoding algorithm, which is the main objective of this project. 
Here I attach the explanation of the algorithm and also the way I use the implemented function. With some examples tested in Sage.  Algorithm  
Here is the implementation of the function: Code

So far, I've tried to use what is already implemented in Sage. But one problem I presented with this function is that I need the permutations of one vector with given hamming weight. And I did it using "IntegerVectorsModPermutationGroup" nevertheless this part is very consuming time. And for codes which have a big length is not going to work. So, for the next week I'm planning to find another way of implementing this. At least for the binary case, which I think should be posible with bitwise operations.  

by Veronica (noreply@blogger.com) at July 21, 2013 09:46 AM

July 20, 2013

PolyBoRi Blog

GSOC 2013 project progress for weeks 0-4

I am Ioana Tamas, and this is a summary of the weekly updates of my GSOC 2013 project for Polybori.
Hopefully, the next updates will be posted in separate blog posts. My repository is: http://sourceforge.net/p/libzdd/code.

Weeks 0 &1:

I have been allocating a lot of time to learning how to work with big code bases etc., so the progress may not be very outstanding. Also, I figured that I will not be able to follow the proposed timeline, one example being the fact that updating the reference and memory allocation methods can't be done without updating the main classes and structures at the same time (which appear as later milestones).

Now, a summary of the code I have written until now:

1) I began the ZDD class, which will contain all the members and methods of the Cudd ZDD class, plus the ones inherited from Cudd classes that are of no use for Polybori (they were implemented in Cudd for handling more types of decision diagrams).

2) I started writing the structures needed for the completeness of the class definition, though I have not decided yet if I should rather have classes instead of structures. I also removed the structure members that are not useful for our purpose.

3) I have set up testing files, but I didn't include meaningful unit tests yet.

4) I added the Cudd code base in my directory, and I will gradually delete the files that are not useful until there is no Cudd left.

5) I made a namespace "libzdd" to avoid name coincidences. However, the way I did this may cause problems when I need Cudd files.

To finalize and summarize, the things I still have issues with would be:
- constructing the library using SCons, so that it accepts the boost unit test framework and takes into consideration the Cudd files
- correctly using the namespace
- choosing meaningful unit tests

Week2


During the past week, I started transforming the Cudd's structs into classes and further working on my main source and header file. After revising the timeline a little bit, I decided that I should take a break from what I started (so I temporarily commented a great part of my code) in order to work more on actually providing something that can be tested, using some temporary definitions. This part is still in progress.

Week3

The past week I temporarily kept only my Zero-Suppressed Decision Diagram class definition, using typedefs for the other relevant classes at the moment. This way, I was able to set up some boost tests and (more difficult than I expected) passing them. They test the things implemented until now, which are the constructors and some operators.

I also changed the structure of the repository, by making nice and clean header and source files for everything that I use currently use.


I also deleted a few things that I had introduced too early, but they will come back at the right time (they are saved somewhere in my computer for now).

Week4

This week, I made some significant progress in making the zdd_ZDD class (my project's main class) avoid using raw pointers, by wrapping the access to the Diagram Manager for all the constructors and operators.

In order to achieve this, I made use of boost shared pointers, and I introduced a new class.


I also introduced a unique table, with its own class (that will soon totally replace Cudd's), using boost tuples and std maps.





by Ioana Tamas (noreply@blogger.com) at July 20, 2013 07:09 PM

July 16, 2013

Eviatar Bach

Confluent hypergeometric functions in Sage

The confluent hypergeometric functions are solutions to Kummer's differential equation, $$z\frac{d^2w}{dz^2} + (b-z)\frac{dw}{dz} - aw = 0.$$ Two linearly independent solutions are functions denoted as $M$ and $U$, which I implemented symbolically in Sage ($U$ was already implemented, but only numerically).

Here are some things it can do:

sage: (hypergeometric_M(1, 1, 1) +
....: hypergeometric_U(1, 2, 1)).simplify_hypergeometric()
e + 1
sage: hypergeometric_U(1, 3, x).simplify_hypergeometric()
(x + 1)/x^2
sage: hypergeometric_M(1, 3/2, 1).simplify_hypergeometric()
1/2*sqrt(pi)*e*erf(1)
sage: hypergeometric_U(2, 2, x).series(x == 3, 100).subs(x=1).n()
0.403652637676806
sage: hypergeometric_U(2, 2, 1).n()
0.403652637676806


As far as I can tell, no open-source computer algebra system has this level of simplification of confluent hypergeometric functions; Maxima is used here, but it is not wrapped like this in Maxima. I hope this will prove useful for those working with hypergeometric functions or differential equations in Sage.

It's fairly trivial to implement the Whittaker functions in a similar way; the reason I haven't yet is because I didn't want to make the patch larger and stall review, and because the Maxima conversions are a bit trickier.

The newly implemented dynamic attributes for symbolic expressions have proved enormously useful for this ticket, as well as for the generalized hypergeometric functions and for Volker Braun's new implementation of piecewise functions. I may write a new section for the Developer's Guide which explains symbolic functions in detail, including the dynamic attributes.

I uploaded a patch at #14896; however, since it makes use of the generalized hypergeometric framework of #2516, that patch has to be merged first. Any help with review would be greatly appreciated!

by Eviatar at July 16, 2013 08:31 PM

July 12, 2013

Eviatar Bach

Google Summer of Code status update

My work was delayed for a few days due to having to deal with some registration issues at my university.

I now have all the main features of hypergeometric functions working, with an initial patch at #2516, and also fixed a bug dependency with #14858. Now I am just expanding the documentation and adding tests.

I also wrote a patch for unholding expressions with held operations, #10034, which I thought initially I would use for implementing simplification of hypergeometric functions, but ended up doing it differently. It is useful for other purposes, however.

by Eviatar at July 12, 2013 01:14 AM

July 08, 2013

Verónica Suaste Morales

GSoC Second Week (June 24-29)

The work for this week was to implement a function which, given a linear binary code, returns the Groebner representation of itselt. This Groebner representation will after, help us to compute the groebner basis of the ideal asociated to the linear code. You can check the details about this Groebner representation and algorithm here: Algorithm

The process for this function implementation was as follow:
-Understand what it is already implemented about monomial orderings
-Implement sub-functions as insert_next, next_term and member (see pdf above)
-Implement funciton groebner_representation

Code of this funcions: Code

Also, this algorithm has been tested with examples. This work you can see it at cloud sage. My project "Grobner_Project" is public but only my mentors and I can modify it.
After complete this algorithm for computing the Grober basis I'll make the patch.

by Veronica (noreply@blogger.com) at July 08, 2013 04:20 PM

July 07, 2013

Verónica Suaste Morales

GSoC First Week (June 14-21)

My project submitted to GSoC has been accepted!

Great start!!
In order to get involved with Sage development the first week of work I had the opportunity to attend Sage Days 48 .
This workshop was very inspiring for me. I meet other sage developers and their work.
I learnt about the different ways to contributing sage and the easiest way to do it.
What I most liked it was the introduction to Sage @ cloud. Here you can find a description of this project and what you can do with it: Sage Cloud

This tool has become esencial in my project.
Advantages I have with Sage cloud:
-multiple open windows (with synchronized files! :) ) So you can test what you just changed in the source code, without being opening and closing windows and files.
-all you can do with a terminal
-share the project with my mentors
-I can work everywhere without needing my laptop.
 In conclusion, so far, it has been very very convenient. And, I'm still discovering and exploring how to take advantage of this awesome tool.

I leave you the link of the presentation by William Stein in case you want(you should) to know more about it.
http://www.youtube.com/watch?v=MZ4gF33cLfQ&feature=youtu.be

by Veronica (noreply@blogger.com) at July 07, 2013 03:47 PM

July 02, 2013

Ondrej Certik

My impressions from the SciPy 2013 conference

I have attended the SciPy 2013 conference in Austin, Texas. Here are my impressions.

Number one is the fact that the IPython notebook was used by pretty much everyone. I use it a lot myself, but I didn't realize how ubiquitous it has become. It is quickly becoming the standard now. The IPython notebook is using Markdown and in fact it is better than Rest. The way to remember the "[]()" syntax for links is that in regular text you put links into () parentheses, so you do the same in Markdown, and append [] for the text of the link. The other way to remember is that [] feel more serious and thus are used for the text of the link. I stressed several times to +Fernando Perez and +Brian Granger how awesome it would be to have interactive widgets in the notebook. Fortunately that was pretty much preaching to the choir, as that's one of the first things they plan to implement good foundations for and I just can't wait to use that.

It is now clear, that the IPython notebook is the way to store computations that I want to share with other people, or to use it as a "lab notebook" for myself, so that I can remember what exactly I did to obtain the results (for example how exactly I obtained some figures from raw data). In other words --- instead of having sets of scripts and manual bash commands that have to be executed in particular order to do what I want, just use IPython notebook and put everything in there.

Number two is that how big the conference has become since the last time I attended (couple years ago), yet it still has the friendly feeling. Unfortunately, I had to miss a lot of talks, due to scheduling conflicts (there were three parallel sessions), so I look forward to seeing them on video.

+Aaron Meurer and I have done the SymPy tutorial (see the link for videos and other tutorial materials). It's been nice to finally meet +Matthew Rocklin (very active SymPy contributor) in person. He also had an interesting presentation
about symbolic matrices + Lapack code generation. +Jason Moore presented PyDy.
It's been a great pleasure for us to invite +David Li (still a high school student) to attend the conference and give a presentation about his work on sympygamma.com and live.sympy.org.

It was nice to meet the Julia guys, +Jeff Bezanson and +Stefan Karpinski. I contributed the Fortran benchmarks on the Julia's website some time ago, but I had the feeling that a lot of them are quite artificial and not very meaningful. I think Jeff and Stefan confirmed my feeling. Julia seems to have quite interesting type system and multiple dispatch, that SymPy should learn from.

I met the VTK guys +Matthew McCormick and +Pat Marion. One of the keynotes was given by +Will Schroeder from Kitware about publishing. I remember him stressing to manage dependencies well as well as to use BSD like license (as opposed to viral licenses like GPL or LGPL). That opensource has pretty much won (i.e. it is now clear that that is the way to go).

I had great discussions with +Francesc Alted+Andy Terrel+Brett Murphy+Jonathan Rocher+Eric Jones+Travis Oliphant+Mark Wiebe+Ilan Schnell+Stéfan van der Walt+David Cournapeau+Anthony Scopatz+Paul Ivanov+Michael Droettboom, +Wes McKinney, +Jake Vanderplas, +Kurt Smith+Aron Ahmadia+Kyle Mandli, +Benjamin Root and others.


It's also been nice to have a chat with +Jason Vertrees and other guys from Schrödinger.

One other thing that I realized last week at the conference is that pretty much everyone agreed on the fact that NumPy should act as the default way to represent memory (no matter if the array was created in Fortran or other code) and allow manipulations on it. Faster libraries like Blaze or ODIN should then hook themselves up into NumPy using multiple dispatch. Also SymPy would then hook itself up so that it can be used with array operations natively. Currently SymPy does work with NumPy (see our tests for some examples what works), but the solution is a bit fragile (it is not possible to override NumPy behavior, but because NumPy supports general objects, we simply give it SymPy objects and things mostly work).

Similar to this, I would like to create multiple dispatch in SymPy core itself, so that other (faster) libraries for symbolic manipulation can hook themselves up, so that their own (faster) multiplication, expansion or series expansion would get called instead of the SymPy default one implemented in pure Python.

Other blog posts from the conference:

by Ondřej Čertík (noreply@blogger.com) at July 02, 2013 12:05 PM

June 29, 2013

Eviatar Bach

Google Summer of Code: Summary of the first two weeks

It has been two weeks since I started my Google Summer of Code project for Sage.

Before the start date, I began on the benchmark framework and adding SymPy conversions to functions that lacked them. The latter will make conversions from Sage to SymPy expressions better, although the reverse is still a problem. This is because Sage doesn't parse SymPy expressions as it would for Maxima, for example; it calls a _sage_ method on the SymPy objects, which means that Sage has to rely on SymPy developers to add the conversions, which hasn't been reliable so far. Maybe this should be changed in the future.

For the first week I was fortunate to be at Sage Days 48, ostensibly organized for working on the Sage Notebook, although some, including me, worked on different parts of Sage. I really enjoyed it and learned a lot. Burcin Erocal wrote some patches adding dynamic attributes to symbolic expressions (#9556) and allowing symbolic functions to take tuple arguments (#14780), which lays the foundation for implementing hypergeometric functions smoothly. I worked on getting some special function-related patches merged, including making Bessel functions symbolic (#4102), making Airy functions symbolic (#12455), and fixing numerical evaluation of log_gamma (#12521, a palindrome ticket!). I also started working on hypergeometric functions.

This past week I have been mostly working on the hypergeometric functions (see here for an introduction), building on Fredrik Johansson's code in #2516. It needs some modification, since it was implemented in a way to work around the aforementioned limitations of symbolic expressions that have now been removed with Burcin's patches. I've been making progress and should have a patch next week.

I'm going to make sure I'm diligent with uploading and following through with patches, since I've heard this is sometimes a problem with GSoC projects.

by Eviatar at June 29, 2013 08:30 PM

June 28, 2013

Rasmi Elasmar

First Handshake: Meeting the Sage Cell Server

The first and most immediately important task of updating the Sage Android application as part of Google’s Summer of Code is to update the way the app interacts with the Sage Cell Server, which performs the calculations in the cloud so that the Android device doesn’t have to locally. Currently, the application communicates with the server through a series of HTTP requests — the client (our app) initializes the connection, then sends and receives query data back and forth from the server. This all sounds fine and efficient, but by relying solely on HTTP requests, the client ends up having to constantly poll the server to check the status of the calculation (to see if there are any updates), which is inefficient and not ideal for a light application on a light device. It’s a decent way of doing things, but the year is 2013, and the Sage Cell Server now supports WebSocket, so my first task is to update the app’s client-server interactions so that users can once again send calculations and receive results.

First, the client must make initial contact with the server in a sort of “handshake” between the two. The app will send an HTTP request, and the server will reply with connection details, at which point the WebSocket connection may be established. This seems easy to implement, but as someone who hasn’t spent much time with networking in Java, I had some difficulty getting hands to shake. The requests I was sending were seemingly correct, but the response I received was similar to that of visiting the site directly from a web browser — 405: Method Not Allowed. How could I let the server know that the app was special, and that it wanted to have meaningful interactions together? With some help from Volker, I was able to see that the issue was in the headers of my request. First, it seemed as if I should use GET instead of POST, which turned out not to be the case.

sagecellpython

The POST I was looking for (via the Python client), as revealed by WireShark.

I was advised to make use of WireShark to inspect each request I was making (using the Python sample client as my example of proper HTTP ettiquette). A POST request was being made, but it was being made to the wrong URI — I ended up using Java’s built-in URI functionality to generate one properly. At this point, all that was holding my request back from serverside acceptance was its headers. By adding Accept-Econding:identity, I finally had intialized a proper POST request, as indicated by the server’s response.

Success!

At this point, the server responds with a simple JSON object that contains a “kernel_id” and the “ws_url”, indicating that it is ready to begin our session with the information provided. Using the WebSocket URL and the kernel ID, it is simple at this point to establish a connection through one of the many fine WebSocket libraries that exist for Android. Two connections are made: an IOPub socket and a Shell socket. Once the connection is established, the rest is simply a matter of sending and receiving calculations and results in JSON form on both channels. The IOPub channel details the status of the calculation, while the Shell channel deals with the calculation and results themselves. Data is only sent and received when something has actually happened (at which point either the client or server can react accordingly), and once everything is done, the connection is closed. Especially when considering more advanced features such as Interacts, the improvement in networking and overall efficiency and simplicity from switching over to these sockets is clear.

Handling results and sending/receiving calculation data in general should be simpler now that the client/sever networking process is simplified. Now, it is a matter of getting everything to work together in order for us to run calculations from the app itself.


Although at this moment it has no functional networking capabilities (and therefore cannot perform calculations), you can download the older version of the Sage Math Android app from the Play Store and view (and even contribute to!) the source on GitHub. You can also read more about the Sage Cell Server and its interactions on the GitHub Wiki.

by Rasmi at June 28, 2013 08:10 PM

June 16, 2013

Vince Knight

Why and how: open education resources.

This is the fourth post in a series of posts reflecting on the teaching and learning in a recent course I've taught on SAS and R. This post will be quite different in nature to the previous posts which looked at students choices between SAS and/or R in their assessment:
Here I would like to talk about how I deliver teaching materials (notes, exercises, videos etc) to my students.

All of the teaching materials for the course can be found here: drvinceknight.github.io/MAT013/.

How they got there and why I think it's a great place for them to be will be what I hope to discuss...

A Virtual Learning Environment that was not as good as alternatives.

At +Cardiff University we have a VLE provided that is automatically available to all our students that all lecturers are encouraged to use. So when I started teaching I diligently started using the service but it had various aspects that did not fit well with my workflow (having to upload files on every change, clunky interface that actually seemed optimised for IE and various other things). It was also awkward (at the time, I believe that this has been addressed now) for students to use the environment on smart phones etc...

As an alternative, I setup a very simple website using google sites and would use +Dropbox's public links to link pdfs and other resources for my students. An example of such a delivery is these basic Game Theoretical materials. This gave me great control, I no longer had to mess around with uploading versions of files, every change I made was immediately online and also as the site was pretty simple (links and pdfs) it was easily accessible to students on all platforms (I could also include some YouTube videos).

An immediate consequence of this approach is that my materials are all publicly available online.

To anyone, our students or not. The first thing I did was check with +Paul Harper: the director of the MSc course that I was only teaching on at the time that this was ok. We chatted about it a bit and were both happy to carry on. My main train of thought was that there are far better resources already available online so mine might as well be. (I've subsequently checked with our School's director of learning and teaching and there's no internal regulations against it which is nice to know about +Cardiff University)

There is a huge amount of talk about open access in research (I won't go in to that here) but less so to some extent in teaching. I did find this interesting newspaper article that ponders as to "Why don't more academics use open educational resources?". This offers a good general discussion about open education resources.

I would feel very very humbled if anyone chose to actually use my resources. I'm at the early part of my career and am still learning so I don't think that will happen anytime soon but there is another more important benefit to having my teaching stuff online for everyone.

I always post about any new courses I'm working on, on G+ and am grateful to get a fair bit of feedback from other academics around the world. This in itself gives me a certain level of confidence in front of my students who know that what I'm teaching them is verifiable by anyone in the world. I've often changed a couple of things based on feedback by other academics and I think that's brilliant.

To some extent my teaching resources are not just reviewed by a couple of peers in my university but also by anyone online who might be interested in them.

(It would be great if research worked this way too)

Through G+ (I've posted about how awesome a tool G+ is as an academic personal development tool) I learnt about git and github. If you don't know about git watch this video by +Zoë Blade is very helpful:


After a while I jumped in and starting using it. After a little longer while I found out that you can use github to host a website:


Using this I it is really easy to put together a very basic website that has all the teaching materials. The added benefit is that the materials are now all in a github repo which opens them up even more (using dbox, only the pdf files were in general in view) whereas now everything is (md, tex source files etc...) and theoretically if anyone wanted to they could pull etc...

I'm certainly not the first person to put teaching stuff up on github, (watching people like +Dana Ernst+Theron Hitchman and various others do it is what made me jump in).

The github repo for my R and SAS course can be found here and here are some other teaching things I have up on github (with the corresponding webpage if I've gotten around to setting it up):
To finish off here are the various reasons I have for putting my teaching stuff up on github:
  • Openness:
    • my students know that this is viewable by everyone which hopefully gives the resources a level of confidence;
    • people on G+ and elsewhere are able to point out improvements and fixes (if and when they have time);
  • Access: the sites are really simple (basic html with links) so they can be viewed on more or less anything;
  • Ease of use: I don't have to struggle to use whatever system is being used. If it's an option I kind of refuse to use stuff that makes me less efficient (an example of this is our email system: I use gmail). At the moment the system I like is github + git.
I wrote a blog post (which is the most read thing I've ever written - online or offline - I think) showing how to combine various things like tikz, makefiles, +Sage Mathematical Software System etc to automate the process of creating a course site so I'll put a link to that here.

by Vincent Knight (noreply@blogger.com) at June 16, 2013 05:43 AM

June 04, 2013

PolyBoRi Blog

GSOC 2013 project for Polybori

I am Ioana-Maria Tamas and for this year's Google Summer of code, I will be working for lmonade's project: "Binary decision diagrams for Boolean polynomial rings". The main details of the project are the ones  below, but small modifications may occur.

  • Abstract

Zero-suppressed binary decision diagrams are used by Polybori for efficiently representing Boolean polynomials. At the moment, they are manipulated via CUDD, which is not specialized on this type of diagrams and only uses C in the implementation. The goal of the project is implementing an independent library in C++, that is specialized on  zero-suppressed binary decision diagrams.

  • Objective

There are no major problems in the current method used to deal with decision diagrams, but implementing the new library will definitely increase Polybori’s usability and efficiency. The things that will be reimplemented are the reference counting, the caching management, the diagram manager and then the operators and methods from the decision diagram class.

  • Deliverables

The final product will be a well-documented, well-tested and easy to build library, that will help Polybori manipulate Boolean Polynomials without using CUDD in the background, and in a more efficient and specialized way.

  • Timeline


    Time Frame
    Milestone
    17 June - 24 June
    Prepare unit tests for a wrapper class that works with CUDD in the background
    25 June - 5 July
    Implement a C++ reference counting method
    6 July - 11 July
    Add the new method to the wrapper and test
    12 July - 26 July
    Implement a C++ cache management method
    27 July - 2 August
    Add the new method to the wrapper and test
    Prepare for midterm evaluation
    3 August - 16 August
    Re-implement the DdManager, DdNode and DdChild and test them
    17 August - 31 August
    Implement a C++ ZDD class, with all the operations and methods available in CUDD
    1 September - 9 September
    Write unit tests that are independent from CUDD
    10 September - 15 September
    Optional: Make an independent C++ library using the new implementations (with a proper build system)
    Test everything without CUDD
    16 September - 23 September
    Finalize documentation
    Prepare for final evaluation

by Ioana Tamas (noreply@blogger.com) at June 04, 2013 09:57 PM

June 02, 2013

May 29, 2013

Eviatar Bach

Google Summer of Code introduction

I'm excited to have been chosen for Google Summer of Code 2013! I'm going to be working on Sage under the mentorship of Flavia Stan and Burcin Erocal. You can see my proposal on Google Docs.

I'm going to be posting updates on the project on this blog.

by Eviatar at May 29, 2013 11:26 PM

May 28, 2013

Harald Schilly

Sage announces 3 GSoC projects

Sage is pleased to announce three Google Summer of Code projects for 2013. They focus on speed improvements of symbolic functions, simplifying the distribution and installation procedure on Debian/Linux and ubiquitous accessibility of Sage on the Android platform.


Mathematical Functions Library

Eviatar Bach –  University of British Columbia in Vancouver, Canada
(Mentor: Flavia Stan, Backup: Burcin Erocal)


Sage interfaces with multiple third-party libraries, such as MPFR, GSL, GP/PARI, mpmath, and Maxima, for numerical evaluation of special functions. There are significant discrepancies between these backends in the performance for numerical approximations of the same expression. An initial benchmark reveals, for example, that calculating spherical_bessel_J(1, 5.2) with SciPy is over 100 times faster than with Maxima.

The project has the following goals:
  1. develop a benchmark framework to determine which backend should be used by default to evaluate a special function over a specific domain,
  2. create symbolic wrappers for all the special functions that can be evaluated numerically by a package included in Sage,
  3. create a data structure for generalized hypergeometric functions and extend the symbolic wrappers to obtain representations in terms of generalized hypergeometric functions when possible,
  4. implement closure properties for holonomic functions as a next step to improve the symbolic processing of special functions in Sage.


Overall improvement of the Sage Android application

Rasmi Elasmar
(Mentor: Volker Braun, Backup: Harald Schilly)


Although there are already some existing efforts, Sage is still not easily accessible from the Android platform. The Sage Cell client/server infrastructure is an already existing step towards running Sage on a server and communicating back the results. The aim of this proposal is to fix, improve and update the Sage Android application to include new features and functionality, as well as an improved interface for simpler and improved usability. Android's new "Holo" style, sharing of calculations and results, and much more waits to be realized on Android for Sage.


Get Sage ready for Linux distributions

Felix Salfelder – Goethe Universität, Frankfurt, Germany
(Mentor: Tobias Hansen, Julien Puydt, Jeroen Demeyer & John Palmieri )

The aim of this project is to detach the build process of Sage ("the software") from Sage ("the distribution"). The goal is a build system that works within the context of Sage as well as for any GNU/Linux distribution that ships the dependencies for Sage. Distributions that already ship Sage packages or plan to do so are Fedora and Debian. This project is an important step towards making Sage packages in GNU/Linux distributions feasible.



Sage warmly welcomes all three new students and wishes them all the best to learn something new and make an impact in Sage's future developments!

by Harald Schilly (noreply@blogger.com) at May 28, 2013 11:45 AM

May 21, 2013

Vince Knight

Probability of saying 'yes' to academic responsibilities

I've just read a great post by +Adriana Salerno: Learning to say no.

In that post Adriana discusses how in mathematics (and I'm sure a bunch of other/most fields) one needs a long period of uninterrupted time to work on Research she links to this Big Bang Theory clip:



She also however talks about how as an early career researcher it's important to take opportunities for responsibilities as and when they come. This is something that rings very true to me. Growing up I played a lot of rugby and basically had a "Say yes to coach" attitude ("Vince, you're slow, run sprints" - "Yes coach", "Vince, you're going to sit on the bench this week" - "Yes coach" etc... - Although I actually said "Oui Monsieur" as all my rugby was played in France, but I digress...).

I've kind of taken that attitude in to the early days of my career (I'm still a 'young pup' academia wise) but I also am very grateful of every opportunity that gets sent my way (I'm very lucky to be sitting on various committees, the editorial boards for a couple of journals and am in the middle of preparing not 1 but 2 brand new courses which is a great opportunity as opposed to being given others people's courses!).

Having said that, as Adriana points out in her blog it's important to find a balance so that I can also do some research.

The point of this post is not to say that I've figured out how to do that but to post this xkcd style graph that I made using this package on github: XKCDify.

If this was done by Randall Munroe the Alt Text would be far better...


I'm about at the point where the solid line meets the dashed line (ie the "unkown" for me). I suspect that I'm still being quite optimistic as to how low the probability of saying yes will go for me as I still generally do as I'm told and appreciate the opportunities greatly :)

In Adriana's post she talks about a "research day", I might try to be strict on that...

PS Here's another similar kind of graph that +Paul Harper (my head of research group) put together when he was actually looking back a bit on his 10 years in Academia.

(If anyone's interested here's the repo with the code I used to get that plot, I actually used +Sage Mathematical Software System 's find_fit command to fit a quintic to the few points I wanted to have on there... There might be a better way to do that though...)

by Vincent Knight (noreply@blogger.com) at May 21, 2013 07:16 AM

May 05, 2013

Vince Knight

Counting semi magic squares using generating functions and Sage.

+John Cook has written a couple of posts recently that caught my eye and they kind of start with this one: Rolling dice for normal sample: Python version.

Go check it out it's pretty cool but basically John Cook shows how to use the Sympy package to obtain coefficients from generating functions in python.

I thought I'd do something similar but with +Sage Mathematical Software System (which is built on python and would be using Sympy on this occasion) but also on a slightly different topic: Semi Magic Squares.

A semi magic square is defined as a square matrix whose row and columns sums are equal to some constant $r$.

So for example semi magic squares of size 3 will be of the form:

$\begin{pmatrix} a&b&r-a-b\\ c&d&r-c-d\\ r-a-c&r-b-d&a+b+c+d-r \end{pmatrix}$

(with $0\leq a, b, c, d, a+c, b+d, a+b, c+d \leq r \leq a+b+c+d$ and $a,b,c,d \in \mathbb{Z}$)

Using this it can be shown that $|\text{SMS}(3,r)|=\binom{r+4}{4}+\binom{r+3}{4}+\binom{r+2}{4}$ (where SMS(n,r) is the set of semi magic squares of size $n$ and line size $r$). A really nice and accessible proof of this is in this paper by Bona: A New Proof of the Formula for the Number of 3 by 3 Magic Squares.

That is however not the point of this blog post :) In fact I'm just going to show the Sage code needed to enumerate the set SMS(n,2).

There's a nice result obtained in A combinatorial distribution problem by Anand Dumir and Gupta in 1966 that expresses the number of elements in SMS(n,r) as a generating function:

$\frac{e^{\frac{x}{2}}}{\sqrt{1-x}}=\sum_{n=0}^{\infty}|\text{SMS}(n,2)|\frac{x^n}{n!^2}$

In the rest of this post I'll now show how to use +Sage Mathematical Software System to get the numbers of Semi Magic Squares of line sum 2 and size $n$.

First of all we need to define the above generating function. In Sage this is easy and as we're going to use the variable $x$ don't need to declare it as a symbolic variable (this is by default for $x$ in sage).

Here's the natural way of doing this:
sage: f(x) = e ^ (x / 2)/(sqrt(1 - x))

We can get a plot of $f(x)$ easily in Sage:
sage: plot(f(x),x,-10,1)

which gives:



To obtain the taylor series (up to $x^5$) expansion of this function we simply call:
sage: taylor(f(x),x,0,5)

Which returns:
69/160*x^5 + 47/96*x^4 + 7/12*x^3 + 3/4*x^2 + x + 1

Recalling the generating function formulae above we see that the size of $|\text{SMS}(5,2)|$ is: $(5!)^2\times \frac{69}{160}=6210$.

We can use usual python syntax to define a function that will return $|\text{SMS}(n,2)|$ for any $n$:
sage: def SMS(n,2):
....: return taylor(f(x),x,0,n).coeff(x^n)*(factorial(n)^2)

This makes use of the coeff method on the polynomials in sage.

To get $|\text{SMS}(n,2)|$ for $0\leq n \leq 10$ we can use:
sage: print [SMS(n,2) for n in range(11)]

which gives:
[0, 1, 3, 21, 282, 6210, 202410, 9135630, 545007960, 41514583320]

If we throw that sequence in to the OEIS (my best friend during my PhD) we indeed get the correct sequence: A000681 (if you don't know about the OEIS it's awesome, be sure to click one of those two links).

In my previous post today I just used pure Python to carry out simulations to estimate $\pi$ but there are certain things that I find Sage to be better suited for (like generating functions :) ).

In my PhD I looked at a set of objects similar to Semi Magic Squares called: Alternating Sign Matrices.

You can find some of the arguments I briefly talk about here in my thesis (the thought of someone actually reading it is terrifying). I used Mathematica throughout my PhD but (without wanting to sound like a sales pitch) if I had known about Sage (not a sponsor) at the time I would have probably used Sage.

(All the math on this page is rendered using http://www.mathjax.org/ which I'm not entirely sure to have configured correctly for +Blogger; if some of the math hasn't rendered try reloading the page...)

by Vincent Knight (noreply@blogger.com) at May 05, 2013 11:18 AM