May 11, 2012

Harald Schilly

Sage goes GSoC

This year Sage is part of the GSoC program for the first year. This is an exciting new opportunity to get new contributors on board and to aim for new features. We got nearly 60 project proposals, of which about 8 or more were really excellent ones. We got 3 slots to fill and therefore it was a really hard decision which projects to pick.  We are curious how this will turn out during the summer.


The projects are:

The actual work starts on May 21st and all three students will give regular updates through their blogs or by posting to the mailing list to keep others in the loop what they are doing. All regular developers are invited to give them feedback and help them if they encounter any problems.

by hsy (noreply@blogger.com) at May 11, 2012 12:22 PM

May 10, 2012

Jan Pöschko

Existing occurrences of lattices

As already pointed out by Dmitrii Pasechnik in the comments to the project proposal, one of the first things to do is probably to search for existing occurrences of lattices in Sage.

A bold search for "lattice" in *.py and *.pyx files in sage-5.0b11/sage/sage yields 3,946 matches. Here's what they are about (places with potential overlap with or use for our new Lattice class in bold):
  • algebras.quatalg.basis_for_quaternion_lattice: returns a basis for the `\\ZZ`-lattice in a quaternion algebra spanned the given generators. It uses _hnf_pari (which calculates the upper triangular Hermite normal form) of an integer matrix.
  • categories.*: "lattice" in the sense of a partially ordered set with unique supremum.
  • coding.code_constructions.ToricCode
  • combinat.*: poset.
  • crypto.lattice.gen_lattice: generates different types of lattice bases of row vectors relevant in cryptography.
  • geometry: lattice polytopes, integral points. Using the PALP library.
  • group.abelian_gps.abelian_group.subgroup_reduced: find small basis.
  • homology.simplicial_complex.lattice_paths: 
  • interfaces.r: poset.
  • libs.fplll: wrapper for the floating-point LLL implementation in fplll. This library also contains a floating-point implementation of the Kannan-Fincke-Pohst algorithm that finds a shortest non-zero lattice vector, and the Block Korkin-Zolotarev (BKZ) reduction algorithm.
  • libs.ntl.ntl_mat_ZZ.LLL (and variants): LLL implementation with different precisions (up to arbitrary precision floats).
  • libs.pari.gen: period lattices of elliptic curves.
  • matrix.matrix_integer_dense: LLL and BKZ functions (using fplll or ntl).
  • modular.abvar.*: calculate lattices that define various subgroups etc.
  • modular.modsym.ambient.ModularSymbolsAmbient.integral_structure
  • modular.quatalg.brandt: orders and ideals in quaternion algebras.
  • modular.etaproducts.EtaGroup_class.basis: uses LLL algorithm.
  • modules.fg_pid.fgp_module: lattice used in example code.
  • modules.free_module.FreeModule_generic_pid.index_in: lattice_index.
  • quadratic_forms
  • rings.number_field.CyclotomicField._positive_integral_elements_with_trace: enumerate lattice points.
  • rings.number_field.order: lattice index.
  • rings.number_field.totallyreal_data.hermite_constant: nth Hermite constant.
  • rings.number_field.totallyreal_rel.integral_elements_in_box: uses geometry.lattice_polytope.LatticePolytope to find points numerically.
  • rings.number_field.enumerate_totallyreal_fields_prim: find a minimal lattice element.
  • rings.polynomial_modn_dens.small_roots: uses LLL algorithm.
  • rings.qqbar: poset (lattice of algebraic extensions).
  • rings.rational_field: example code using period lattice of elliptic curves.
  • schemes.elliptic_curves: period lattice of elliptic curves.
  • schemes.generic.*: lattice polytopes.
Do you know of any other relevant places where lattices are used in Sage? Any ideas on how any of these pieces of code might benefit from a new Lattice class?

by Jan Pöschko (noreply@blogger.com) at May 10, 2012 02:12 AM

May 07, 2012

Jan Pöschko

Welcome

Welcome to the blog about the Google Summer of Code project Lattices for Sage.

by Jan Pöschko (noreply@blogger.com) at May 07, 2012 02:16 AM

April 02, 2012

David Joyner

Sestinas and Sage

According to [1], a sestina is a highly structured poem consisting of six six-line stanzas followed by a tercet (called its envoy or tornada), for a total of thirty-nine lines. The same set of six words ends the lines of each of the six-line stanzas, but in a shuffled order each time. The shuffle used [...]

by wdjoyner at April 02, 2012 08:04 PM

March 18, 2012

Martin Albrecht

Sage accepted to Google Summer of Code

For the first time, Sage was accepted as a mentoring organisation for the Google Summer of Code. We are now in the “Would-be student participants discuss application ideas with mentoring organizations” stage so get in touch if you want to … Continue reading

by martinralbrecht at March 18, 2012 11:35 AM

March 17, 2012

PolyBoRi Blog

Releasing PolyBoRi 0.8.1

Dear PolyBoRi users,
we are proud to announce that PolyBoRi 0.8.1 is available at
  https://sourceforge.net/projects/polybori/files/polybori/0.8.1/

We continue to follow the two major goals of the 0.8 branch: stability and compatibility. In addition, for improved supporting of external developers we partially restructured the sources of libpolybori_groebner.


The full changelog of release 0.8.1 reads as follows:
* fallback m4ri updated to release 20111203
* Merged Cudd with cudd 2.5.0
* Prefixed patched Cudd's function with pbori_
* ipbori is now an all-python script with fallback to plain python if
  IPython is not available
* Python interface utilizes shared libraries
* PolyBoRi's shared libraries are found via relative rpath
  (relative install-name on Darwin)
* introducing DEFAULT_*FLAGS
* Install/InstallAs fixes permissions
* ipbori -t runs PolyBoRi's doctests
* internal refactoring of ReductionStrategy and GroebnerStrategy started
* avoiding long long

Binary packages are available for recent OpenSuSE, Mandriva and Fedora as well as Debian and Ubuntu distributions. Just point your package manager here:
  http://download.opensuse.org/repositories/home:/AlexanderDreyer/PolyBoRi/

In addition the corresponding communities bundled preliminary spkgs
  http://trac.sagemath.org/sage_trac/ticket/12655
and Gentoo ebuild scripts are available via the lmonade project at http://www.lmona.de/ .

With best regards,
  The PolyBoRi Team

by Alexander Dreyer of PolyBoRi (noreply@blogger.com) at March 17, 2012 12:35 AM

March 10, 2012

William Stein

Sage and Python 3?

Have you ever wondered how difficult it might be to switch Sage from Python 2.7 to Python 3.x?  Some students in my Sage course made a webpage that summarizes the Python 3 support status of the Python packages on which Sage depends.

Interesting examples:
  • Matplotlib doesn't support Python 3.x at all yet.
  • Mercurial doesn't support Python 3.x, and they don't have any plans to do so. (Which is another point in favor switching Sage to GIT.)

by William Stein (noreply@blogger.com) at March 10, 2012 07:57 AM

March 08, 2012

William Stein

How to get access to Magma

Despite Sage doing a lot, people still write to me asking about how to get access to Magma. Here is my advice:
  1. You can buy a copy of Magma here for only $1180.
  2. If your calculation will take less than 20 seconds (?), you can do it over the web for free. (Historical note: I wrote the first version of that website.)
  3. In Sage, use the command magma_free('some string'), though I just checked and the Magma calculator has changed in a way that breaks this command again. I'll be posting a patch to fix this in Sage.
  4. Most people I know who have Magma do not buy it, but get it in exchange for contributing to Magma (I used to get it free for that reason). Study the list of
    past and current contributors to Magma and ask the one you know the best. This is a list of people who definitely have a copy of Magma and know how to use it.

by William Stein (noreply@blogger.com) at March 08, 2012 02:57 PM

March 05, 2012

PolyBoRi Blog

Wanted: Applied Computer Algebra in Tech

Currently I'm about to support a student's projects for collecting and categorizing examples arising from technical applications of computer algebra. Our benchmark examples from formal verification of digital systems and cryptoanalysis mark a (proper) subset of these use cases. But from my experience such projects prosper from a wider example base.

Are there readers around which use computer algebra techniques in industrial, biological, medical, or chemical applications? Would you like to direct our attention to some examples from your area? Of course, one of our goals is proper citing of the contributed stuff as well as mentioning the contributor itself.

If you are interested to drop us a message at industrial@3r4u.de.


My best,
  Alexander

by Alexander Dreyer of PolyBoRi (noreply@blogger.com) at March 05, 2012 10:43 PM

February 27, 2012

Doxdrum

Table of all possible irreps of Lie groups

Last weekend, I wrote a program in SAGE that list all possible irreps of a Lie group, one specified the group (in Cartan’s classification) and the maximum sum of the Dynkin labels. Check the notebook in here. Since I’m not a programmer, please feel free to leave comments, specially if you find ways to optimize [...]

by doxdrum at February 27, 2012 08:31 PM

January 31, 2012

Alasdair McAndrew

Easy Lagrangian interpolation

As everybody knows, the formula for Lagrangian interpolation though points , where all the values are distinct, is given by It is easy to see that this works, as the product is equal to zero if and is equal to one if . This isn’t particularly easy to program, and in fact methods such as [...]

by amca01 at January 31, 2012 03:00 AM

January 26, 2012

Ondrej Certik

When double precision is not enough

I was doing some finite element (FE) calculation and I needed the sum of the lowest 7 eigenvalues of a symmetric matrix (that comes from the FE assembly) to converge to at least 1e-8 accuracy (so that I can check calculation done by some other solver of mine, that calculates the same but doesn't use FE). In reality I wanted the rounded value to 8 decimal digits to be correct, so I really needed 1e-9 accuracy (but it's ok if it is let's say 2e-9, but not ok if it is 9e-9). With my FE solver, I couldn't get it to converge more than to roughly 5e-7 no matter how hard I tried. Now what?

When doing the convergence, I take a good mesh and keep increasing "p" (the polynomial order) until it converges. For my particular problem, it is fully converged for about p=25 (the solver supports the order up to 64). Increasing "p" further will not increase the accuracy anymore, and the accuracy stays at the level 5e-7 for the sum of the lowest 7 eigenvalues. For optimal meshes, it converges at p=25, for not optimal meshes, it converges for higher "p", but in all cases, it doesn't get below 5e-7.

I know from experience, that for simpler problems, the FE solver can easily converge to 1e-10 or more using double precision. So I know it is doable, now the question is what the problem is: there
are a few possible reasons:

  • The FE quadrature is not accurate enough
  • The condition number of the matrix is high, thus LAPACK doesn't return very accurate eigenvalues
  • Bug in the assembly/solver (like single/double corruption in Fortran, or some other subtle bug)
When using the same solver for simpler potential, it converged nicely to 1e-10. So this suggests there is no bug in the assembly or solver itself. It is possible that the quadrature is not accurate enough, but again, if it converges for simple problem, it's probably not it. So it seems it is the ill conditioned matrix, that causes this. So I printed the residuals (that I simply calculated in Fortran using the matrix and the eigenvectors returned by LAPACK), and it only showed 1e-9. For simpler problems, it can go to 1e-14 easily. So that must be it. How do we fix it?

Obviously by making the matrix less ill conditioned, which is caused by the mesh for the problem (the ratio of the longest/shortest elements is 1e9) but for my problem I really needed such a mesh. So the other option is to increase the real number accuracy.

In Fortran all real variables are defined as real(dp), where dp is an integer defined at a single place in the project. There are several ways to define it, but it's value is 8 for gfortran and it means double precision. So I increased it to 16 (quadruple precision), recompiled. Now the whole program calculates in quadruple precision (more than 30 significant digits). I had to recompile LAPACK using the "-fdefault-real-8" gfortran option, that promotes all double precision numbers to quadruple precision, and I used the "d" versions (double precision, now promoted to quadruple) of LAPACK routines.

I rerun the calculation ---- and suddenly LAPACK residuals are around 1e-13, and the solver converges to 1e-10 easily (for the sum of the lowest 7 eigenvalues). Problem solved.

Turning my Fortran program to quadruple precision is as easy as changing one variable and recompiling. Turning LAPACK to quadruple precision is easy with a single gfortran flag (LAPACK uses the old f77 syntax for double precision, if it used real(dp), then I would simply change it as for my program). The whole calculation got at least 10x slower with quadruple. The reason is that gfortran runtime uses the libquadmath library, that simulates quadruple precision (as current CPUs only support double precision natively).

I actually discovered a few bugs in my program (typically some constants in older code didn't use the "dp" syntax, but had the double precision hardwired). Fortran warns about all such cases, when the real variables have incompatible precision.

It is amazing how easy it is to work with different precision in Fortran (literally just one change and recompile). How could this be done with C++? This wikipedia page suggests, that "long double" is only 80bit in most cases (quadruple is 128bit), but gcc offers __float128, so it seems I would have to manually change all "double" to "__float128" in the whole C++ program (this could be done with a single "sed" command).

by Ondřej Čertík (noreply@blogger.com) at January 26, 2012 10:51 AM

January 10, 2012

PolyBoRi Blog

C++-ification for upcoming release

Developing PolyBoRi 0.8.1 started right after finishing double releases 0.8 and 0.7.2. Albeit, the next release will contain some new features, the main idea is to stabilize current functionality and to make it more reliable, extendable and sane.

The idea is to have proper C++ code of the specialized Gröbner routines. Therefore, we will learn from the experiences of the integration of PolyBoRi 0.8 in upcoming Sage and Singular.

by Alexander Dreyer of PolyBoRi (noreply@blogger.com) at January 10, 2012 11:56 PM

December 15, 2011

William Stein

How big is the core Sage library?

I just did the following with Sage-4.8.alpha5:
  1. "sudo apt-get install sloccount".
  2. "cp -rv SAGE_ROOT/devel/sage-main /tmp/x"
  3. Use a script [1] to rename all .pyx and .pxi files to .py.
  4. Ran "sloccount *" in the /tmp/x directory, which ignores autogenerated .c/.cpp files coming from Cython.

Here's the result for the full Sage library, which does not distinguish between Python and Cython. Note that sloccount really only counts lines of code -- comments are blank lines are ignored.

Totals grouped by language (dominant language first):
python: 530370 (96.41%)
ansic: 14538 (2.64%)
cpp: 5188 (0.94%)

This suggests that the core Sage library is just over a "half million lines of Python and Cython source code, not counting comments and whitespace".

Here's the breakdown by module:
SLOC    Directory       SLOC-by-Language (Sorted)
88903 rings python=87720,cpp=1183
72913 combinat python=71629,cpp=1284
47747 schemes python=46255,cpp=1492
39815 graphs python=28377,ansic=11438
31540 matrix python=31540
31019 modular python=31012,ansic=7
24475 libs python=21171,ansic=2845,cpp=459
20517 misc python=20383,ansic=134
18006 interfaces python=18006
17577 geometry python=16936,cpp=641
12775 categories python=12775
12093 server python=12093
11971 groups python=11971
11961 plot python=11961
10686 crypto python=10686
9920 modules python=9920
8389 symbolic python=8260,cpp=129
8150 algebras python=8150
7260 ext python=7198,ansic=62
7093 structure python=7093
6364 coding python=6364
5670 functions python=5670
5249 homology python=5249
4798 numerical python=4798
4323 quadratic_forms python=4323
3919 gsl python=3919
3911 calculus python=3911
3879 sandpiles python=3879
3003 sets python=3003
2647 databases python=2647
2074 logic python=2074
1736 finance python=1736
1608 games python=1608
1465 monoids python=1465
1435 tests python=1383,ansic=52
1370 stats python=1370
971 interacts python=971
959 tensor python=959
906 lfunctions python=906
308 parallel python=308
275 probability python=275
219 media python=219
197 top_dir python=197

Here is the script [1]:
#!/usr/bin/env python

import os, shutil

for dirpath, dirnames, filenames in os.walk('.'):
for f in filenames:
if f.endswith('.pyx') or f.endswith('.pxi'):
print f
shutil.move(os.path.join(dirpath, f),
os.path.join(dirpath, os.path.splitext(f)[0] + '.py'))

by William Stein (noreply@blogger.com) at December 15, 2011 02:42 PM

December 13, 2011

William Stein

Using Sage to Support Research Mathematics

When using Sage to support research mathematics, the most important point to make is to strongly encourage people to do the extra work to turn their "scruffy research code" into a patch that can be peer reviewed and included in Sage. They will have to 100% doctest it, and the quality of their code may improve dramatically as a result. Including code in Sage means that the code will continue to work as Sage is updated. Also, the code is peer reviewed and has to have examples and documentation for every function. That's a much higher bar than just "reproducible research".

Moreover, getting code up to snuff to include in Sage will often also reveal mistakes that will avoid embarrassment later. I'm fixing some issues related to a soon-to-be-done paper right now that I found when doing just this for trac 11975.

This final step of turning snippets of research code into a peer-reviewed contribution to Sage is: (1) a surprisingly huge amount of very important useful work, (2) something that is emphasized as an option for Sage more than with Magma or Mathematica or Pari (say), (3) something whose value people have to be sold on, since they get no real extra academic credit for it, at present, usually, and journal referees often don't care either way (I do, but I'm probably in the minority there), and (4) something that a *lot* of research mathematicians do not do. As an example of (4), in the last two months I've seen a ton of (separate!) bodies of code which is all sort of secret research code in various Dropbox repos, and which isn't currently squarely aimed at going into Sage. I've also seen a bunch of code related to Edixhoven et al.'s algorithm for computing Galois representation with a similar property (there is now trac 12132, due to my
urging).

I did *not* do this step yet with this recently accepted paper. Instead I used "scrappy research code" in psage to do the fast L-series computations. The referee for Math Comp didn't care either way, actually... I hope this doesn't come back to haunt me, though there are many double checks here (e.g., BSD) so I'm not too worried. I will do this get-it-in-Sage step at some point though.

This will be better for the community in the long run, and better for individual researcher's credibility too. And there is a lot of value in having a stable refereed snapshot of code on which a published (=very stable) paper is based.

by William Stein (noreply@blogger.com) at December 13, 2011 02:07 PM

November 15, 2011

William Stein

Is the time ripe for http://sagenb.com?

On a Sage mailing list, Karl Crisman wrote: "Regarding the downtime issue [of http://sagenb.org], there have occasionally been rumors of someone or some organization starting a service which would guarantee uptime and provide support."

I might do this. It might be at http://sagenb.com (right now sagenb.com points to sagenb.org). Here's the "business plan". Probably, sagenb.com would appear almost the same as sagenb.org, except there would be Google (?) ads on the side of your worksheets, and the revenue would go toward paying for:
  1. Commercial server hosting:  Some Amazon EC2 instances
  2. An employee (or later, employees) to maintain the servers: at the beginning, this would be me in my free time, since I have a lot of experience with this.
  3. Advertising that sagenb.com exists and we want users!:  Unlike sagenb.org, we would very strongly encourage as many people as possible to use sagenb.com.   The advertising and landing page would explain that though sagenb.com generates money, all that money is all given back to the Sage development community (see below).
  4. Hire employees to improve the notebook: Fix bugs, implement features, etc. There would be a public commitment that all such work would be open sourced and get included with Sage. This would include adding support for a for-pay "pro" subscription version, adding nicer "offline support" (via a Sage install on the user's computer), integrated spreadsheets and better data visualization and manipulation tools for Science and business applications, and enabling development of Sage's core library and patches submission entirely through the notebook, etc.


At some point, there could be a $X/year subscription version that would remove all ads, and increase disk space available to that user. There would also be a $Y/year university site license version with customized authentication (e.g., LDAP?) for each university.   The university site license version might also include Maple/Mathematica/Matlab/Magma pre-installed in their notebook server, assuming appropriate site licenses are honored, so sagenb.com could be something that goes beyond just a platform for "sage as math software".   We can of course also tap into the R market, given that Sage includes R.

I imagine the above being done as a not-for-profit effort, so if it brought in a lot of revenue (e.g., more than needed for hosting and employees), excess money would go to the Sage Foundation to support other Sage development activities. Regarding numbers, according to Google Analytics, right now on average well over 1000 people use sagenb.org every day.    Standard commercial hosting costs for EC2 to support this load would be roughly $100/month. If each visitor generates on average of 1 penny of ad revenue per day (is this a reasonable estimate -- I have no clue?), then we would expect to make $3,650 in one year, which would be enough to fund the EC2 service (at about $2000/year), with a profit of $1,650.  

Now let's dream big!  There might be 1,000,000 potential daily users out there for a service like this, if it worked sufficiently well, since there are many college students (and people that use math and stats programs like R in the sciences, and R is part of Sage) in the world.   Scaling up by a factor of 1,000 would yield over $1 million/year, after paying for hosting fees.   This would be enough to fund substantial work to improve Sage, the notebook, and have a paid Patch Editor position (imagine buying out a top Sage developer professor, e.g., John Palmierri, from 50% of his teaching obligations in exchange for him spending the time he would spend teaching instead organizing that the patches to Sage get properly refereed).  Maybe we could even hire back some of the (many!) people who were Sage developers when they were mathematics grad student or postdocs, but who then went to work at a top web technology company and acquired useful experience there (and are now way too busy to work on Sage).

This has the potential to make Sage a more longterm self-sustaining project. It's probably not possible to get traditional venture capital for a not-for-profit plan like the one above, but fortunately that is not needed due to (1) the generous support the National Science Foundation is currently providing toward development on the Sage notebook, and (2) private donations to the Sage Foundation.  In particular, (2) provides enough money to bootstrap a sagenb.com for a while.

I think the main potential downside is competition.  If somebody else does the same thing right now for profit without giving back their changes, and captures the market it's hard to imagine how the above would work.  Since we don't use the Affero GPL for the Sage notebook, it is legal for somebody to do a lot of customization work to Sage and notebook, create a web-app using this customized version, and give back nothing to the community, so long as they don't redistribute their modified versions publicly. This isn't crazy -- not so long ago, I had a major company (I won't say who out of respect) tell me they planed to do something like that.    And "random people" suggest it somewhat regularly when I give talks about Sage.   I'm surprised it hasn't happened yet, but it definitely hasn't.

So the time to do this is today.  The notebook software we have is now finally reasonably scalable, primarily due to work of Mike Hansen and Rado Kirov.  Our funding situation is good this year.  We have strong good will and interest right now from both the Mathematical Association of America and WebWork developers.   If we wait longer, the one chance to truly make Sage financially self-sustaining in a way that fits with my dreams and values will pass.  

I hope that by making all plans open, and by having a commitment to put the profits back into Sage development, an enterprise like I describe here will be in a better position to attract users than a purely for profit venture by somebody else.

by William Stein (noreply@blogger.com) at November 15, 2011 08:58 AM

September 08, 2011

PolyBoRi Blog

Double release PolyBoRi 0.8.0 and 0.7.2

We are proud to announce that PolyBoRi 0.8 is available at Sourceforge. The development of our 0.8 branch was driven by two major goals: stability and compatibility. Both of them are important to support ongoing attempts of full-fleshed computer algebra systems - for instance by Singular - to integrate PolyBoRi.

First of all, PolyBoRi 0.8 corrects some critical design issues. The most important change was, that we said good bye to the active ring. While each polynomial object already had a reference to its ring, default constructors of those objects used the active ring for initialization. This frequently had lead to inconsistencies which we can avoid now. For you, this means that polynomial (and likewise) objects are not default-constructible any more. Since they need a valid ring reference on construction you have to provide the latter as constructor argument.

Second we improved the scons-based build system. Among classical Linux, we officially support platforms like Cygwin, Apple's Darwin, Gentoo and  Gentoo-Prefix (even on Darwin). PolyBoRi is also prepared for generating rpm- and deb-based packages.

Finally, the compatibilty with recent release of upstream libraries like libm4ri was re-established.

We also wanted to make our improved build system and various compatibility and bug fixes available to ongoing projects, that are based on PolyBoRi 0.7.x. Hence, we backported those improvements to the 0.7 branch. The result was PolyBoRi 0.7.2 making today a double release day.

The full changelog of release 0.8.0 reads as follows:
  • improved standard-conformity, multiprocessing safety, and usability
  • renamed subdirectory polybori to libpolybori
  • new directory structure in libpolybori/include and groebner/include
  • libgroebner renamed libpolybori_groebner 
  • using only part of CUDD (linked into libpolybori)
  • officially introducing VariableBlock
  • removed all global settings and BooleEnv itself
  • added factories and RingContext to replace active/global ring
  • improved multiprocessing and pickling
  • added inSingleBlock/in_single_block to polynomials
  • for avoiding name clashes Cudd's headers install into include/polybori/cudd 
  • Introducing WeakRingPtr for managing weak references to rings on C++ level
  • python level: deprecated Ring changing routines change_ordering, set_variable_names, append_block removed, use extended arguments of Ring.__init__() and Ring.clone().
  • ipython 0.11 compatibility established (hint from the Fedora team. Thanks!)
  • libM4RI compatibility re-established (hint from Martin Albrecht. Thanks!)
  • distributed targed uses world-readable permissions
  • cluster.py: very experimental Python module for finding an overdetermined subset of equations
Binary packages are available for recent OpenSuSE, Mandriva and Fedora as well as Debian and Ubuntu distributions. Just point your package manager here.  In addition the corresponding communities bundled preliminary spkgs and Gentoo ebuild scripts.

In the next weeks, small tutorials and will follow at this place and at Singular Faces.

My best,
  Alexander

by Alexander Dreyer of PolyBoRi (noreply@blogger.com) at September 08, 2011 11:35 AM

August 31, 2011

William Stein

Sage: Creating a Viable Free Open Source Alternative to Magma, Maple, Mathematica, and MATLAB


The goal of the Sage project is to create a viable fre open source alternative to Magma, Maple(TM), Mathematica(R), and MATLAB(R), which are the most popular non-free closed source mathematical software systems. (Maple is a trademark of Waterloo Maple Inc. Mathematica is a registered trademark of Wolfram Research Incorporated. MATLAB is a registered trademark of MathWorks. I will refer to the four systems together as ``Ma'' in the rest of this article.) Magma is (by far) the most advanced non-free system for structured abstract algebraic computation, Mathematica and Maple are popular and highly developed systems that shine at symbolic manipulation, and MATLAB is the most popular system for applied numerical mathematics. Together there are over 3,000 employees working at the companies that produce the four Ma's listed above, which take in over a hundred million dollars of revenue annually.

By a viable free alternative to the Ma's, we mean a system that will have the important mathematical features of each Ma, with comparable speed. It will have 2d and 3d graphics, an interactive notebook-based graphical user interface, and documentation, including books, papers, school and college curriculum materials, etc. A single alternative to all of the Ma's is not necessarily a drop-in replacement for any of the Ma's; in particular, it need not run programs written in the custom languages of those systems. Thus it need not be like Octave or R, which (nearly) clone the languages of MATLAB and S, respectively. Development would instead focus on implementing functions that users demand, rather than systematically trying to implement every single function of the Ma's. The culture, architecture, and general look and feel of such a system would be very different than that of the Ma's.



Motivation for Starting Sage


Each of the Ma's cost substantial money, and is hence expensive for me, my collaborators, and students. The Ma's are not owned by the community like Sage is, or Wikipedia is, for that matter.


The Ma's are closed, which means that the implementation of some algorithms are secret, in which case you are not allowed to modify or extend them.

The Mathematica Documentation: "You should realize at the outset that while knowing about the internals of Mathematica may be of intellectual interest, it is usually much less important in practice than you might at first suppose. Indeed, in almost all practical uses of Mathematica, issues about how Mathematica works inside turn out to be largely irrelevant. Particularly in more advanced applications of Mathematica, it may sometimes seem worthwhile to try to analyze internal algorithms in order to predict which way of doing a given computation will be the most efficient. [...] But most often the analyses will not be worthwhile. For the internals of Mathematica are quite complicated.."

The philosophy espoused in Sage, and indeed by the vast open source software community, is exactly the opposite. We want you to know about the internals, and when they are quite complicated, we want you to help make them more understandable. Indeed, Sage's growth depends on you analyzing how Sage works, improving it, and contributing your improvements back.
sage: crt(2, 1, 3, 5)  # Chinese Remainder Theorem

11
sage: crt? # ? = documentation and examples
Returns a solution to a Chinese Remainder Theorem...
...
sage: crt?? # ?? = source code
def crt(...):
...
g, alpha, beta = XGCD(m, n)
q, r = (b - a).quo_rem(g)
if r != 0:
raise ValueError("No solution ...")
return (a + q*alpha*m) % lcm(m, n)
Moreover, by browsing the Mercurial repository, you can see exactly who wrote or modified any particular line of code in the Sage library, when they did it, and why. Everything included in Sage is free and open source, and it will foreover remain that way.

Linus Torvalds: "I see open source as Science. If you don't spread your ideas in the open, if you don't allow other people to look at how your ideas work and verify that they work, you are not doing Science, you are doing Witchcraft. Traditional software development models, where you keep things inside a company and hide what you are doing, are basically Witchcraft. Open source is all about the fact that it is open; people can actually look at what you are doing, and they can improve it, and they can build on top of it. [...] One of my favorite quotes from history is Newton: `If I had seen further, it has been by standing on the shoulders of giants.'"

The design decisions of the Ma's are not made openly by the community. In contrast, important decisions about Sage development are made via open public discussions and voting that is archived on public mailing lists with thousands of subscribers.

Every one of the Ma's uses a special mathematics-oriented interpreted programming language, which locks you into their product, makes writing some code outside mathematics unnecessarily difficult, and impacts the number of software engineers that are experts at programming in that language. In contrast, the user language of Sage is primarily the mainstream free open source language Python, which is one of the world's most popular interpreted programming languages. The Sage project neither invented nor maintains the underlying Python language, but gains immediate access to the IPython shell, Python scientific libraries (such as NumPy, SciPy, CVXopt and MatPlotLib), and a large Python community with major support from big companies such as Google. In comparison to Python, the Ma's are small players in terms of language development. Thus for Sage most of the problems of language development are handled by someone else.

The bug tracking done for three of four of the Ma's is currently secret (MATLAB has an open bug tracker, though it requires free registration to view.), which means that there is no published accounting of all known bugs, the status of work on them, and how bugs are resolved. But the Ma's do have many bugs; see the release notes of each new version, which lists bugs that were fixed. Sage also has bugs, which are all publicly tracked at the trac server, and there are numerous ``Bug Days'' workshops devoted entirely to fixing bugs in Sage. Moreover, all discussion about resolving a given bug, including peer review of solutions, is publicly archived. We note that sadly even some prize winning free open source systems, such as GAP, do not have an open bug tracking system, resulting in people reporting the same bugs over and over again.

Each of the Ma's is a combination of secret unchangeable compiled code and less secret interpreted code. Users with experience programming in compiled languages such as Fortran or C++ may find the loss of a compiler to be frustrating. None of the Ma's has an optimizing compiler that converts programs written in their custom interpreted language to a fast executable binary format that is not interpreted at runtime. (MATLAB has a compiler, but ``the source code is still interpreted at run-time, and performance of code should be the same whether run in standalone mode or in MATLAB.'' Mathematica also has a Compile function, but simply compiles expressions to a different internal format that is interpreted, much like Sage's fast_callable function.) In contrast, Sage is tightly integrated with Cython. The Cython project has received extensive contributions from Sage developers, and is very popular in the world of Python-based scientific computing. Cython is a Python-to-C/C++ compiler that speeds up code execution and has support for statically declaring data types (for potentially enormous speedups) and natively calling existing C/C++/Fortran code. For example, enter the following in a cell of the Sage notebook:
def python_sum2(n):

s = int(0)
for i in xrange(1, n+1):
s += i*i
return s
Then enter the following in another cell:
%cython

def cython_sum2(long n):
cdef long i, s = 0
for i in range(1, n+1):
s += i*i
return s
The second implementation, despite looking nearly identical, is nearly a hundred times faster than the first one (your timings may vary).
sage: timeit('python_sum2(2*10^6)')

5 loops, best of 3: 154 ms per loop
sage: timeit('cython_sum2(2*10^6)')
125 loops, best of 3: 1.76 ms per loop
sage: 154/1.76
87.5

Of course, it is better to choose a different algorithm. In case you don't remember a closed form expression for the sum of the first $n$ squares, Sage can deduce it:
sage: var('k, n')

sage: factor(sum(k^2, k, 1, n))
1/6*(n + 1)*(2*n + 1)*n
And now our simpler fast implementation is:
def sum2(n):

return n*(2*n+1)*(n+1)/6
Just as above, we can also use the Cython compiler:
%cython

def c_sum2(long n):
return n*(2*n+1)*(n+1)/6
Comparing times, we see that Cython is 10 times faster:
sage: n = 2*10^6

sage: timeit('sum2(n)')
625 loops, best of 3: 1.41 microseconds per loop
sage: timeit('c_sum2(n)')
625 loops, best of 3: 0.145 microseconds per loop
sage: 1.41/.145
9.72413793103448
In this case, the enhanced speed comes at a cost, in that the answer is wrong when the input is large enough to cause an overflow:
sage: c_sum2(2*10^6)   # WARNING: overflow

-407788678951258603
Cython is very powerful, but to fully benefit from it, one must understand machine level arithmetic data types, such as long, int, float, etc. With Sage you have that option.

What is Sage?


The goal of Sage is to compete with the Ma's, and the intellectual property at our disposal is the complete range of GPL-compatibly licensed open source software.

Sage is a self-contained free open source distribution of about 100 open source software packages and libraries that aims to address all computational areas of pure and applied mathematics. See the list of packages in Sage, which includes R, Pari, Singular, GAP, Maxima, GSL, Numpy, Scipy, ATLAS, Matplotlib, and many other popular programs. The download of Sage contains all dependencies required for the normal functioning of Sage, including Python itself. Sage includes a substantial amount of code that provides a unified Python-based interface to these other packages. Sage also includes a library of new code written in Python, Cython and C/C++, which implements a huge range of algorithms.



History


I made the first release of Sage in February 2005, and at the time called it "Software for Arithmetic Geometry Experimentation." I was a serious user of, and contributor to, Magma at the time, and was motivated to start Sage for many of the reasons discussed above. In particular, I was personally frustrated with the top-down closed development model of Magma, the fact that several million lines of the source code of Magma are closed source, and the fees that my colleagues had to pay in order to use the substantial amount of code that I contributed to Magma. Despite my early naive hope that Magma would be open sourced, it never was. So I started Sage motivated by the dream that someday the single most important item of software I use on a daily basis would be free and open. David Joyner, David Kohel, Joe Wetherell, and Martin Albrecht were also involved in the development of Sage during the first year.

In February 2006, the National Science Foundation funded a 2-day workshop called ``Sage Days 2006'' at UC San Diego, which had about 40 participants and speakers from several open and closed source mathematical software projects. After doing a year of fulltime mostly solitary work on Sage, I was surprised by the positive reception of Sage by members of the mathematical research community. What Sage promised was something many mathematicians wanted. Whether or not Sage would someday deliver on that promise was (and for many still is) an open question.

I had decided when I started Sage that I would make it powerful enough for my research, with or without the help of anybody else, and was pleasantly surprised at this workshop to find that many other people were interested in helping, and understood the shortcomings of existing open source software, such as GAP and PARI, and the longterm need to move beyond Magma. Six months later, I ran another Sage Days workshop, which resulted in numerous talented young graduate students, including David Harvey, David Roe, Robert Bradshaw, and Robert Miller, getting involved in Sage development. I used startup money from University of Washington to hire Alex Clemesha as a fulltime employee to implement 2d graphics and help create a notebook interface to Sage. I also learned that there was much broader interest in such a system, and stopped referring to Sage as being exclusively for ``arithmetic geometry''; instead, Sage became ``Software for Algebra and Geometry Experimentation.'' Today the acronym is deprecated.

The year 2007 was a major turning point for Sage. Far more people got involved with development, we had four Sage Days workshops, and prompted by Craig Citro, we instituted a requirement that all new code must have tests for 100% of the functions touched by that code, and every modification to Sage must be peer reviewed. Our peer review process is much more open than in mathematical research journals; everything that happens is publicly archived on trac. During 2007, I also secured some funding for Sage development from Microsoft Research, Google, and NSF. Also, a German graduate student studying cryptography, Martin Albrecht presented Sage at the Troph'ees du Libre competition in France, and Sage won first place in ``Scientific Software'', which led to a huge amount of good publicity, including articles in many languages around the world and appearances, for example, this Slashdot article.

In 2008, I organized 7 Sage Days workshops at places such as IPAM (at UCLA) and the Clay Mathematics Institute, and for the first time, several people besides me made releases of Sage. In 2009, we had 8 more Sage Days workshops, and the underlying foundations of Sage improved, including development of a powerful coercion architecture. This coercion model systematically determines what happens when performing operations such as a + b, when a and b are elements of potentially different rings (or groups, or modules, etc.).
sage: R. = PolynomialRing(ZZ)

sage: f = x + 1/2; f
x + 1/2
sage: parent(f)
Univariate Polynomial Ring in x over Rational Field
We compare this with Magma (V2.17-4), which has a more ad hoc coercion system:
> R := PolynomialRing(IntegerRing());

> x + 1/2
^
Runtime error in '+': Bad argument types
Argument types given: RngUPolElt[RngInt], FldRatElt

Robert Bradshaw and I also added support for beautiful browser-based 3D graphics to Sage, which involved writing a 3D graphics library, and adapting the free open source JMOL Java library for rendering molecules to instead plot mathematical objects.

sage: f(x,y) = sin(x - y) * y * cos(x)

sage: plot3d(f, (x,-3,3), (y,-3,3), color='red')

In 2009, following a huge amount of porting work by Mike Hansen, development of algebraic combinatorics in Sage picked up substantial momentum, with the switch of the entire MuPAD-combinat group to Sage (forming sage-combinat), only months before the formerly free system MuPAD{\textregistered}\footnote{MuPAD is a registered trademark of SciFace Software GmbH \& Co.} was bought out by Mathworks (makers of MATLAB). In addition to work on Lie theory by Dan Bump, this also led to a massive amount of work on a category theoretic framework for Sage by Nicolas Thiery.

In 2010, there were 13 Sage Days workshops in many parts of the world, and grant funding for Sage significantly improved, including new NSF funding for undergraduate curriculum development. I also spent much of my programming time during 2010--2011 developing a number theory library called psage, which is currently not included in Sage, but can be easily installed.

Many aspects of Sage make it an ideal tool for teaching mathematics, so there's a steadily growing group of teachers using it: for example, there have been MAA PREP workshops on Sage for the last two years, and a third is likely to run next summer, there are regular posts on the Sage lists about setting up classroom servers, and there is an NSF-funded project called UTMOST devoted to creating undergraduate curriculum materials for Sage.

The publications page lists 101 accepted publications that use Sage, 47 preprints, 22 theses, and 16 books, and there are surely many more ``in the wild'' that we are not aware of. According to Google Analytics, the main Sage website gets about 2,500 absolute unique visitors per day, and the website http://sagenb.org, which allows anybody to easily use Sage through their web browser, has around 700 absolute unique visitors per day.

For many mathematicians and students, Sage is today the mature, open source, and free foundation on which they can build their research program.

by William Stein (noreply@blogger.com) at August 31, 2011 11:06 AM

July 29, 2011

Marshall Hampton

Review of "Sage: Beginner's Guide" by Craig Finch

I was asked by the publisher to review "Sage: Beginner's Guide" by Craig Finch. They sent me a free copy; I have no other conflicts of interest. I am generally biased towards Sage itself, as an avid user and minor developer.

On Amazon you can browse the table of contents, which gives a pretty good idea of the strengths of the book, namely basic computation and plotting, numerical calculations, and data analysis. The focus was an excellent choice considering what is already available. The current free Sage Tutorial is oriented much more towards pure mathematicians. There is a Numerical Computing With Sage as part of the standard documentation, but at the moment its quite short and nowhere near as helpful as Finch's book.

I liked the style of the book a lot. There are many code examples that illustrate how to accomplish concrete tasks, along with good explanations of what they are doing. Many of these are things that are unfortunately far from obvious to a beginner (or even intermediate) Sage user. Despite using Sage heavily for the last five years, I learned some new things. The book is particularly strong in showing how to use Numpy, Scipy, and Matplotlib. Sage wraps a lot of the functionality of these projects, but if you want to do something that isn't included in the standard interfaces it can be quite mystifying.

Chapter 9, "Learning Advanced Python Programming", might have been a little ambitious. There's nothing wrong with it, but its too short to provide enough. Fortunately there are a lot of good books, some of them free, that cover Python programming in much more depth. I would have preferred some of this space and effort to be devoted to using Cython and the @interact command, which are covered very briefly in Chapter 10.

I teach several classes using Sage and I will definitely advertise this text as a useful optional supplement (I consider it a little too expensive to add on as a mandatory second text). It would be nice if some institutions considered using Sage instead of its commercial competitors such as Maple, Matlab, and Mathematica - you could probably give every student a copy of this book for the money saved from license fees!

The only thing I disliked about the book was the quality of the illustrations. Sage output that was in LaTeX was not typeset, but instead looks as if a PNG was copied from a screenshot. Some of the examples would have benefited from being in color. The quality of the plots is also somewhat poor. This is not too big a deal if one is following along with Sage, since you can reproduce the figures. None of them are bad enough to obscure the content.

Overall this is a very impressive and useful introduction to Sage that should help any beginning user a great deal.

by M. Hampton (noreply@blogger.com) at July 29, 2011 10:09 AM

July 18, 2011

Minh Van Nguyen

2011 Spies Prize: Robert Bradshaw

This year, Robert Bradshaw is the winner of the Annual Spies Sage Development Prize. Congratulations, Robert! Here is the prize citation: Robert Bradshaw has been an extremely active and productive Sage developer for over five years. Additionally, he has been a leader, both in maintaining the community and in important design decisions. He is probably [...]

by mvngu at July 18, 2011 08:42 AM

June 08, 2011

Fredrik Johansson

Some FLINT 2.2 highlights

Version 2.2 of FLINT (Fast Library for Number Theory) was released last weekend. Some updated benchmarks are available.

In this blog post, I'm going to talk a bit about features I contributed in this version. With apologies to Sebastian Pancratz who wrote a whole lot of code as well -- in particular, a new module for computing with p-adic numbers, and a module for rational functions! Bill Hart also implemented a faster polynomial GCD, which is a quite important update since GCD is crucial for most polynomial business. Anyhow...

Polynomial interpolation


I've added various functions for polynomial interpolation to the fmpz_poly module. In general, these can be used to speed up various computations involving integer or rational polynomials by mapping a given problem to (Z/nZ)[x], Z or even Z/nZ, taking advantage of fast arithmetic in those rings, and then via interpolation recovering a result in Z[x] or Q[x].

Firstly, there are some new Chinese Remainder Theorem functions for integer polynomials, allowing you to reconstruct an integer polynomial from a bunch of polynomials with coefficients modulo different primes. Straightforward code (the actual work is done by functions in the fmpz module), but useful to have. The CRT functions are used by the new modular GCD code.

There are also functions for evaluating an integer polynomial on a set of points, and forming the interpolating polynomial (generally with rational coefficients) given a set of points and the values at those points.

Finally, user-friendly functions for evaluation and interpolation at a power of two (Kronecker segmentation) have been added. The code for this is actually a very old part of FLINT, and possibly some of the most complicated code in the library (packing bits efficiently is surprisingly hard). The new functions just wrap this functionality, but take care of memory management and various special cases, so you can now just safely do something like:


fmpz_t z;
fmpz_init(z);
long bits = fmpz_poly_max_bits(poly) + 1; /* +1 for signs */
fmpz_poly_bit_pack(z, poly, bits);
fmpz_poly_bit_unpack(poly, z, bits); /* recover poly */
fmpz_clear(z);


Apart from Kronecker segmentation, these functions are not currently asymptotically fast. Fast multi-modulus CRT for coefficient reconstruction is probably not all that important in most circumstances, because it's more common to use evaluation-interpolation techniques for polynomials of large degree and small coefficients than the other way around. Nonetheless, polynomials with large coefficients do arise as well. For example, the vector Bernoulli number code in FLINT relies on fast CRT, and currently uses custom code for this.

Polynomial interpolation uses Lagrange interpolation with barycentric weights, with a few tricks to avoid fractions. This is all implemented using an O(n^2) algorithm, but the actual time complexity is higher due to the fact that the coefficients when working over integers usually will be large, around n! in magnitude.

Here are some timing examples, evaluating and recovering a length-n polynomial with +/-1 coefficients basically as follows:

x = _fmpz_vec_init(n);
y = _fmpz_vec_init(n);
fmpz_poly_init(P);
fmpz_poly_init(Q);

for (i = 0; i n; i++)
x[i] = -n/2 + i;

fmpz_poly_randtest(P, state, n, 1);

fmpz_poly_evaluate_fmpz_vec(y, P, x, n);
fmpz_poly_interpolate_fmpz_vec(Q, x, y, n);


The bits column below measures the largest value in y, which grows quite large despite the input polynomial having small coefficients:


n=8 eval=762 ns interp=13 us bits=8 ok=1
n=16 eval=3662 ns interp=61 us bits=42 ok=1
n=32 eval=29 us interp=673 us bits=-113 ok=1
n=64 eval=136 us interp=4951 us bits=-316 ok=1
n=128 eval=625 us interp=45 ms bits=-762 ok=1
n=256 eval=2500 us interp=792 ms bits=-1779 ok=1
n=512 eval=12 ms interp=10 s bits=-4089 ok=1


As you can see, the interpolation speed is not too bad for small n, but eventually grows out of control. How to do better?

Naive Lagrange interpolation is not optimal: it is possible to do n-point evaluation and interpolation in essentially O(n log2 n) operations. Such algorithms do not necessarily lead to an improvement over the integers (you still have to deal with coefficient explosion), but they should win over finite fields. So the right solution will perhaps be to add polynomial evaluation/interpolation functions based on modular arithmetic.

Rational numbers and matrices



A new module fmpq is provided for computing with arbitrary-precision rational numbers. For the user, the fmpq_t type essentially behaves identically to the MPIR mpq_t type. However, an fmpq_t only takes up two words of memory when the numerator and denominator are small (less than 262), whereas an mpq_t always requires six words plus additional heap-allocated space for the actual number data.

The fmpq functions are a bit faster than mpq functions in many cases when the numerator and/or denominator is small. But the main improvement should come for vectors, matrices or polynomials of rational numbers, due to the significantly reduced memory usage and memory management overhead (especially when many entries are zero or integer-valued).

Some higher-level functionality is also provided in the fmpq module, e.g. for rational reconstruction. The functions for computing special rational numbers (like Bernoulli numbers) have also been switched over to the fmpq type. Another supported feature is enumeration of the rationals (using the Calkin-Wilf sequence or by height). Generating the 100 million "first" positive rational numbers takes 9.6 seconds done in order of height, or 2.6 seconds in Calkin-Wilf order.

FLINT actually does not use fmpq's to represent polynomials over Q (fmpq_poly), and probably never will. The fmpq_poly module represents a polynomial over Q as an integer polynomial with a single common denominator, which is usually faster. The reason for adding the fmpq_t type is that it enabled developing the new fmpq_mat module, which implements dense matrices of rational numbers. For matrices, a common-denominator representation would be less convenient and in many cases completely impractical.

The new FLINT fmpq_mat module is very fast, or at least very non-slow. It is easy to find examples where it does a simple computation a thousand times faster than the rational matrices in Sage.

There's not actually much code in the fmpq_mat module itself; it does almost all "level 3" linear algebra (computations requiring matrix multiplication or Gaussian elimination) by clearing denominators and computing over the integers. This approach is in fact stolen shamelessly from Sage, but the functions in Sage are highly unoptimized in many cases. The code in Sage still wins for many sufficiently large problems as it has asymptotically fast algorithms for many things we do not (like computing null spaces). See the benchmarks page for more details.

I should not forget to mention that I've implemented Dixon's p-adic algorithm for solving Ax = b for nonsingular square A. (I wish I had a good link for Dixon's algorithm here, but sadly it doesn't appear to be described conveniently anywhere on the web. The original paper is "Exact solution of linear equations using P-adic expansions", if you have the means to get through the Springer paywall.)

This is now used both for solving over both Z and Q. The solver in FLINT is competitive with Sage (which uses IML+ATLAS) up to systems of dimension somewhere between perhaps 100 and 1000 (depending greatly on the size of the entries in the inputs and in the solution!). There's much to do here -- we should eventually have BLAS support in FLINT, which will speed up core matrix arithmetic, but there's room for a lot of algorithmic tuning as well.

There are some other minor new matrix features as well... they can be found in the changelog.

Polynomial matrices



A new module (fmpz_poly_mat) is provided for dense matrices over Z[x], i.e. matrices whose entries are polynomials with integer coefficients. The available functionality includes matrix multiplication, row reduction, and determinants. Matrix multiplication is particularly fast, as it uses the Kronecker segmentation interpolation/evaluation technique described above. (A similar algorithm is provided for determinants, but it's not really optimal as this point.)

The benchmarks page has detailed some detailed timings, so I won't repeat them here -- but generally speaking, the FLINT implementation is an order of magnitude faster than Sage or Magma for matrices of manageable size.

There's much more to be done for polynomial matrices. Row reduction is implemented quite efficiently, but it's too slow as an algorithm for many tasks such as computing null spaces of very large matrices. A future goal is to implement asymptotically fast algorithms (see the paper on x-adic lifting by Burçin Eröcal and Arne Storjohann for example).

by Fredrik Johansson (noreply@blogger.com) at June 08, 2011 07:41 PM