This is an announcement for the one-week conference organised by OpenDreamKit at the CIRM premises in Marseille, France. Webpage on the CIRM website for registrations
This is an announcement for a research software engineer position opening at Université Paris-Sud, working on web-based user interfaces and semantic interoperability layers for mathematical computational systems and databases.
Interviews in early Spring 2018 for a recruitment as soon as possible.
For a full-time position, and depending on the applicant’s past experience, between 2000€ and 3000€ of monthly “salaire net” (salary after non-wage labour cost but before income tax). Equivalently, this salary corresponds to a “salaire brut” of up to 46200€ yearly.
At this stage, we have secured funding for at least 9 months of full-time salary. We hope to be able to extend this up to the end of the project (August 2019). Part time positions are negotiable.
The research software engineer will work at the Laboratoire de Recherche en Informatique of Université Paris Sud, in the Orsay-Bures-Gif-Saclay campus, 25 km South-West of Paris city centre.
Paris Sud is the leading site of OpenDreamKit, with eight participants involved in all the work packages. The research software engineer will join that team and support its efforts in WP4 and WP6, targeting respectively Jupyter-based user interfaces and interoperability for mathematical computational systems and databases. A common theme is how to best exploit the mathematical knowledge embedded in the systems. For some context, see e.g. the recent publications describing the Math-In-The-Middle approach.
More specifically, a successful candidate will be expected to contribute significantly to some of the following tasks (see also OpenDreamKit’s Proposal):
Dynamic documentation and exploration system (Task 4.5)
Introspection has become a critical tool in interactive computation, allowing users to explore, on the fly, the properties and capabilities of the objects under manipulation. This challenge becomes particularly acute in systems like Sage, where large parts of the class hierarchy are built dynamically and static documentation builders like Sphinx can no longer render all the available information.
In this task, we will investigate how to further enhance the user experience. This will include:
On the fly generation of Javadoc style documentation, through introspection, allowing e.g. the exploration of the class hierarchy, available methods, etc.
Widgets based on the HTML5 and web component standards to display graphical views of the results of SPARQL queries, as well as populating data structures with the results of such queries,
D4.16: Exploratory support for semantic-aware interactive Jupyter widgets providing views on objects of the underlying computational or database components. Preliminary steps are demonstrated in the Larch Environment project (see demo videos) and sage-explorer. The ultimate aim would be to automatically generate LMFDB-style interfaces.
Whenever possible, those features will be implemented generically for any computation kernel by extending the Jupyter protocol with introspection and documentation queries.
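As a rough illustration of on-the-fly documentation through introspection, here is a minimal sketch in plain Python. It is not the OpenDreamKit or Sage implementation: the function `describe_class` and the toy classes `Magma`, `Monoid` and `DynamicMonoid` are hypothetical names chosen for this example. It only shows that a class created at run time, invisible to a static builder, can still be documented by inspecting it.

```python
import inspect


def describe_class(cls):
    """Build a plain-text documentation page for ``cls`` by introspection."""
    lines = [f"class {cls.__name__}"]
    # The method resolution order gives the linearised class hierarchy.
    lines.append("  hierarchy: " + " -> ".join(c.__name__ for c in cls.__mro__))
    for name, member in inspect.getmembers(cls, callable):
        if name.startswith("_"):
            continue  # skip private and special methods
        doc = inspect.getdoc(member) or "(undocumented)"
        summary = doc.splitlines()[0]
        try:
            signature = str(inspect.signature(member))
        except (TypeError, ValueError):
            signature = "(...)"  # some callables expose no signature
        lines.append(f"  {name}{signature}: {summary}")
    return "\n".join(lines)


# Hypothetical toy hierarchy, for illustration only.
class Magma:
    def product(self, x, y):
        """Return the product of x and y."""
        raise NotImplementedError


class Monoid(Magma):
    def one(self):
        """Return the unit element."""
        raise NotImplementedError


# A class built at run time, which a static documentation builder would not see.
DynamicMonoid = type("DynamicMonoid", (Monoid,), {})

print(describe_class(DynamicMonoid))
```

A kernel-agnostic version would presumably expose the same information through protocol messages rather than a local function call; that part is not sketched here.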
Memoisation and production of new data (Task 6.9)
Many CAS users run large and intensive computations, for which they
want to collect the results while simultaneously working on software
improvements. GAP retains computed attribute values of objects
within a session; Sage currently has a limited cached_method.
Neither offers storage that is persistent across sessions or
supports publication of the result or sharing within a
collaboration. We will use, extend and contribute back to an
appropriate established persistent memoisation infrastructure, such
as python-joblib, redis-simple-cache or dogpile.cache, adding
features needed for storage and use of results in mathematical
research. We will design something that is simple to deploy and
configure, and makes it easy to share results in a controlled
manner, but provides enough assurance to enable the user to rely on
the data, give proper credit to the original computation and rerun
the computation if they want to.
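To make the idea of memoisation that persists across sessions concrete, here is a minimal sketch using only the Python standard library. It deliberately does not use python-joblib, redis-simple-cache or dogpile.cache; the decorator `persistent_memoize`, the JSON record layout and the example function `number_of_partitions` are assumptions made for illustration. Each result is stored on disk together with a small provenance record (function name, arguments, timestamp), hinting at how credit and reruns could be supported.

```python
import functools
import hashlib
import json
import os
import tempfile
import time


def persistent_memoize(cache_dir):
    """Cache the results of a function as JSON files in ``cache_dir``."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            # Key the cache on the function name and its JSON-serialisable arguments.
            key = json.dumps([func.__name__, args], sort_keys=True)
            digest = hashlib.sha256(key.encode()).hexdigest()
            path = os.path.join(cache_dir, digest + ".json")
            if os.path.exists(path):
                # Cache hit: reuse the stored result instead of recomputing.
                with open(path) as f:
                    return json.load(f)["result"]
            result = func(*args)
            # Store the result with a minimal provenance record.
            record = {"function": func.__name__, "args": args,
                      "result": result, "computed_at": time.time()}
            with open(path, "w") as f:
                json.dump(record, f)
            return result
        return wrapper
    return decorator


cache_dir = tempfile.mkdtemp()  # a real deployment would use a shared, durable location
calls = []  # records actual computations, to show when the cache is used


@persistent_memoize(cache_dir)
def number_of_partitions(n):
    """Count the integer partitions of n by dynamic programming."""
    calls.append(n)
    counts = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            counts[total] += counts[total - part]
    return counts[n]


first = number_of_partitions(10)   # computed and written to disk
second = number_of_partitions(10)  # read back from disk
print(first, second, calls)
```

Because the cache lives in files rather than in the process, a new session pointing at the same directory would find the earlier results. Sharing in a controlled manner and trust in the data are the harder parts, and are not addressed by this sketch.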
Knowledge-based code infrastructure (Task 6.5)
Over the last decades, computational components, and in particular Axiom, MuPAD, GAP, or Sage, have embedded more and more mathematical knowledge directly inside the code, as a way to better structure it for expressiveness, flexibility, composability, documentation, and robustness. In this task we will review the various approaches taken in these systems (e.g. categories and dynamic class hierarchies) and in proof assistants like Coq (e.g. static type systems), and compare their respective strengths and weaknesses on concrete case studies. We will also explore whether paradigms offered by recent programming languages like Julia or Scala could enable a better implementation. Based on this we will suggest and experiment with design improvements, and explore challenges such as the compilation, verification, or interoperability of such code.
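The phrase “categories and dynamic class hierarchies” can be illustrated with a small sketch. The code below is loosely inspired by that idea but is not Sage's actual API: `Category`, `parent_class`, `Magmas`, `Monoids` and `IntegersModuloUnderMultiplication` are hypothetical names. Each category declares its super categories and the generic methods it provides, and the class hierarchy is then built at run time from that mathematical information.

```python
import functools


class Category:
    """A category carries its super categories and the generic methods it provides."""

    def __init__(self, name, supers=(), methods=None):
        self.name = name
        self.supers = tuple(supers)
        self.methods = dict(methods or {})

    @functools.cached_property
    def parent_class(self):
        # Build the class dynamically: its bases are the classes of the
        # super categories, so method lookup follows the mathematics.
        bases = tuple(c.parent_class for c in self.supers) or (object,)
        return type(self.name + "Parent", bases, self.methods)


def square(self, x):
    """Generic method available in any magma."""
    return self.product(x, x)


def power(self, x, n):
    """Generic method available in any monoid (n a non-negative integer)."""
    result = self.one()
    for _ in range(n):
        result = self.product(result, x)
    return result


Magmas = Category("Magmas", methods={"square": square})
Monoids = Category("Monoids", supers=[Magmas], methods={"power": power})


class IntegersModuloUnderMultiplication(Monoids.parent_class):
    """A concrete monoid: only the basic operations are implemented by hand."""

    def __init__(self, modulus):
        self.modulus = modulus

    def one(self):
        return 1 % self.modulus

    def product(self, x, y):
        return (x * y) % self.modulus


Z7 = IntegersModuloUnderMultiplication(7)
print(Z7.power(3, 4), Z7.square(5))
print([c.__name__ for c in type(Z7).__mro__])
```

The concrete class inherits `power` and `square` without mentioning them, which is the kind of expressiveness the task refers to. The price is that the hierarchy only exists at run time, which is what makes static documentation, compilation and verification harder.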
Degree in mathematics or computer science; PhD appreciated but not required;
Strong programming experience with languages such as Python, Scala, Javascript, etc; experience with web technologies in general and the Jupyter stack in particular appreciated;
Experience in software design and practical implementation in large software projects; experience with computational mathematics software (e.g. SageMath) appreciated;
Experience in open-source development (collaborative development tools, interaction with the community, …);
Strong communication skills;
Fluency in oral and written English; speaking French is not a prerequisite.
The position will be funded by OpenDreamKit, a Horizon 2020 European Research Infrastructure project that will run for four years, starting from September
Within this ecosystem, the applicant will work primarily on the free open-source mathematics software system SageMath. Based on the Python language and many existing open-source math libraries, SageMath has been developed for 10 years by a worldwide community of 300 researchers, teachers and engineers, and has reached 1.5M lines of code.
The applicant will work within one of the largest teams of SageMath developers, composed essentially of researchers in mathematics and computer science, at the Laboratoire de Recherche en Informatique (LRI) and in nearby institutions. The LRI also hosts a strong team working on proof systems.
To apply for this position, please send an e-mail to upsud-recruitement-research-engineer at opendreamkit.org by March 10, with the following documents (in English) attached:
cover_letter.pdf: a cover letter explaining your interest in this particular position;
CV.pdf: a CV, highlighting among other things your skills and background and your contributions to open source software;
degree.pdf: a copy of your most recent degree including (if applicable) the reviewers’ reports;
reference letters: files reference_letter_.pdf or contact information of potential referees.
Applications sent after March 10 will be considered until the position is filled.
As you might know, SageMath is a software system for mathematical computation. Built on Python, it has extensive libraries for numerous areas of mathematics. One of these areas is Matroid Theory, as has been exhibited several times on this blog.
Google Summer of Code is a program where Google pays students to work on open-source software during the summer.
Once again, SageMath has been selected as a mentoring organization for the Google Summer of Code. We’ve had students work on aspects of the Matroid Theory functionality for the past four years. Maybe this year, YOU can join those illustrious ranks! Check out the call for proposals and ideas list. Read the instructions on both pages carefully. Applications open on March 12, so it’s a good idea to start talking to potential mentors and begin writing your proposal!
This is an online project meeting to review all achievements since March 2017.
Viviane Pons gave a two-hour lecture on mathematical experimentation, research and open-source mathematical software development for a seminar organized by the students of the prestigious École Normale Supérieure de Lyon.
by William Stein ([email protected]) at January 01, 2018 10:29 PM
During the last week of November, two events related to the European Open Science Cloud (EOSC) took place in Brussels: the EOSC stakeholder forum on 28-29 November and the 2017 edition of the DI4R (Digital Infrastructures for Research). These two events were closely related, given the momentum of the 2018-2020 period for the scientific community and the advancement of open and digital science.
The European Open Science Cloud (EOSC) is currently a process to build a digital platform that is inspired by the F.A.I.R. principles. F.A.I.R. stands for: Findable, Accessible, Interoperable and Reusable.
FAIR is the idea behind the recent actions taken by the European Commission and various funding agencies in favour of open data and open science. The EOSC platform, also called the EOSC-Hub, ultimately intends to give the whole European scientific community and beyond the ability to exchange data, easy access to knowledge, and access to all useful infrastructures for all scientific disciplines.
The word “Open” in EOSC must be understood in every possible way. It means (not exhaustively) that the platform will :
Technically speaking, the general idea is to federate existing and yet to appear services, and to make these services interoperable with one another. Services and software can concern, for example, data and computation hosting, authentication, indexing, collaborative tools, data and service catalogues, etc. Thanks to this interoperability, researchers will be able to discover, navigate, use and re-use data, and combine them using a variety of infrastructures. The interpretation of data is (or at least should be, in our reading) quite broad here: it includes metadata and provenance, data models, tools and generally the knowledge required to make sense of the data.
The core role of EOSC-Hub is to coordinate, foster interoperability, develop glue services, and generally speaking steer the efforts toward the needs of researchers. The outcome of the process is still unknown but the ambition is grand. Indeed, an EU official compared the EOSC to the internet. The WWW has been and still is a process, and it has changed the face of the Earth. The ambition of the EOSC is to change the way we do Science. An explanatory video is available here.
The EOSC will eventually be linked to the future pan-European HPC, as well as to the future European Data Infrastructure (EDI), which are funded alongside the EOSC by the European Commission. The EDI and the EOSC may eventually merge, since the difference in purpose between the two is not yet fully clear.
The EOSC is part of two strategies of the European Commission: the Digital Single Market on one hand and the European Research Area on the other hand.
The EOSC is the combination of the two, as it aims at creating a single research community without country or technical barriers (interoperability between software and services).
In its Work Programme 2018-2020 for European Research Infrastructures, the European Commission has put 375 million Euros on the table for “implementing the European Open Science Cloud”. It was initially planned to open a topic of 79M€ for adding other services and infrastructures to the EOSC, but it was postponed until after 2020, to the 9th Framework Programme for Research and Innovation (FP9). According to Augusto Burgueño Arjon, Head of the Unit “eInfrastructure & Science Cloud” at the DG Communication and Networks at the European Commission, it is not yet decided what this fund will be used for, as it will depend on the evolution of the EOSC-Hub. Since, according to the Commission and the main supporters, there is no room for failure, the next 3 years will be crucial. If the EOSC implementation is a success, the first half of FP9 will focus on aggregating the remaining infrastructures and services, while the second half of FP9 will focus on sustaining and scaling the EOSC.
Member States are pushing forward the EOSC as well. Germany and the Netherlands issued in May 2017 a joint position paper on the EOSC, which France officially joined on the 01/12/2017. Furthermore, 13 Member States have to date signed an agreement to start a pan-European HPC programme. This new infrastructure will follow PRACE and will be closely connected to the EOSC.
Outside of public institutions there are 6 major players pushing forward the EOSC:
Together they built a consortium of 74 partners coordinated by EGI, in order to answer the call H2020-EINFRA-12-2017(a) within the Work Programme 2018-2020. This project, named “EOSC-Hub project”, will receive 30M€ from the Commission for the 2018-2020 period. Beneficiaries include Research Infrastructures, national e-Infrastructure providers, SMEs and academic institutions. This consortium will focus on addressing issues of the EOSC such as interoperability, adoption of open standards and protocols, governance structure, etc.
Indeed, our impression is that, while the vision is clear, the design and realisation of this vision has no clear shape yet: many ideas and components are floating around, some fully or half realised, but simultaneously many questions remain unanswered or have not even been asked yet. This state of affairs is not surprising: the vision is grand and has the potential to disrupt (positively) the way in which research is carried out today; the actors are human beings, scientists, institutes, funding bodies and states, all with their own priorities and constraints. Furthermore, there are real technical challenges in putting this “Cloud” together, and there are also cultural challenges in moving more and more research activities towards Open Science. Existing metrics for academics and research institutions do not generally incentivise open science, which makes a change of behaviour difficult. The biggest challenge for the EOSC may then be the challenge of skills and habits because, as we heard from an EU official: “if we do all this and no one is using it, it will be worthless”.
Some infrastructures and services were presented during the DI4R conference as “EOSC building blocks”. The presentations are all available following this link.
Two presentations attracted our attention:
Contributions are open on Github to help develop FAIR metrics for EOSC.
It seems that no e-infrastructure or service presented at the events specifically targets math-based research and teaching, so there could be some room for components of the OpenDreamKit VRE.
Indeed, there is an existing collaboration between EGI and OpenDreamKit for the deployment of JupyterHub in EGI services. This collaboration will become official in the next weeks with the signature of a Memorandum of Understanding between the two parties. Depending on the success of this joint work and on the needs of the EOSC post-2020, the collaboration could be extended.
OpenDreamKit as a consortium promoting Open Source software and Open Data in the name of large communities can take lobbying actions to have an impact on the shape of the future EOSC. The following actions can and will be taken:
1) Endorse the principles of the EOSC declaration by sending an official statement by mail
2) Commit to take some of the specific actions forward
Endorsement and commitment must be sent to [email protected].
From External Board of EOSC working for the Commission: several experts including Jean-François ABRAMATIC (INRIA) and the chairperson John Womersley (European Spallation Source)
From the External Advisory Board of the EOSC-pilot project (launched in preparation of EOSC-Hub project): the list is available on their website. One expert, Françoise GENOVA, is also member of the OpenDreamKit Advisory Board.
Reference: Hans Fangohr’s blog post on his group page
OpenDreamKit is hosting a workshop on “Subgroups and lattices of Lie groups” to take place at the Faber residency in Olot, Spain from Monday 19th of February to Saturday 3rd of March. The aim is to bring together experts in geometry, algebra and combinatorics with software developers in order to improve algorithms and functionalities of open source packages concerning Lie groups and their subgroups.
The organization page of the event is https://wiki.sagemath.org/days93.
This document is part of a collection of use cases.
Jane has written a (math) paper based on computational experiments. She would like anyone to be able to reproduce her calculations.
Describe the experiments as Jupyter notebooks, mixing prose, code, and outputs (think of them as logbooks). Publish them on a public repository (e.g. on GitHub), and make that repository binder-ready. Make the paper itself active (TODO: latexml+thebe?).
If executing the examples requires a non-trivial install/build step, also consider using a Dockerfile, and auto-building the Docker image on https://hub.docker.com/.
TODO: links to favorite instances
TODO: estimate of the number of such instances.
Assuming Jane is familiar with version control and Jupyter (basic lab skills taught at Software Carpentry), and that the experiments were prepared as notebooks, the publishing part could take two hours the first time, and half an hour later on.
This is an announcement for a postdoc position opening at Université Paris-Sud, working on the interplay between Data, Knowledge, and Software in Mathematics, and in particular the exploitation of mathematical knowledge for increased interoperability across computational mathematics software and mathematical databases.
Interviews in early December, for a recruitment from early 2018 to Fall 2019. Since we have a strong candidate for a half-time position, we will also consider candidates interested in a half-time or shorter-duration position.
For full-time work, and depending on the applicant’s past experience, between 2000€ and 3000€ of monthly “salaire net” (salary after non-wage labour cost but before income tax).
Equivalently, this salary corresponds to a “salaire brut” of up to 46200€ yearly (for a full-time position).
The postdoc will work at the Laboratoire de Recherche en Informatique of Université Paris Sud, in the Orsay-Bures-Gif-Saclay campus, 25 km South-West of Paris city centre.
OpenDreamKit’s Work Package 6 explores the interplay between Data, Knowledge and Software in Mathematics. In particular, it aims at exploiting mathematical knowledge for increased interoperability across computational mathematics software and mathematical databases (known as the Math-In-The-Middle approach). See e.g. the recent publications on that topic, and Section 3.1.6 “Workpackage Description” of the OpenDreamKit Proposal.
A successful candidate will be expected to make significant progress, in close collaboration with the other OpenDreamKit participants and the community, on some of the tasks of this Work Package:
D6.8: Curated Math-in-the-Middle Ontology and Alignments for GAP / Sage / LMFDB
T6.5: Knowledge-based code infrastructure
Over the last decades, computational components, and in particular Axiom, MuPAD, GAP, or Sage, have embedded more and more mathematical knowledge directly inside the code, as a way to better structure it for expressiveness, flexibility, composability, documentation, and robustness. In this task we will review the various approaches taken in these systems (e.g. categories and dynamic class hierarchies) and in proof assistants like Coq (e.g. static type systems), and compare their respective strengths and weaknesses on concrete case studies. We will also explore whether paradigms offered by recent programming languages like Julia or Scala could enable a better implementation. Based on this we will suggest and experiment with design improvements, and explore challenges such as the compilation, verification, or interoperability of such code.
The candidate will be welcome to work on closely related though more technical tasks:
T4.5: Dynamic documentation and exploration system
Introspection has become a critical tool in interactive computation, allowing users to explore, on the fly, the properties and capabilities of the objects under manipulation. This challenge becomes particularly acute in systems like Sage, where large parts of the class hierarchy are built dynamically and static documentation builders like Sphinx can no longer render all the available information.
In this task, we will investigate how to further enhance the user experience. This will include:
On the fly generation of Javadoc style documentation, through introspection, allowing e.g. the exploration of the class hierarchy, available methods, etc.
Widgets based on the HTML5 and web component standards to display graphical views of the results of SPARQL queries, as well as populating data structures with the results of such queries,
D4.16: Exploratory support for semantic-aware interactive widgets providing views on objects of the underlying computational or database components. Preliminary steps are demonstrated in the Larch Environment project (see demo videos) and sage-explorer. The ultimate aim would be to automatically generate LMFDB-style interfaces.
Whenever possible, those features will be implemented generically for any computation kernel by extending the Jupyter protocol with introspection and documentation queries.
T6.9: Memoisation and production of new data
Many CAS users run large and intensive computations, for which they
want to collect the results while simultaneously working on software
improvements. GAP retains computed attribute values of objects
within a session; Sage currently has a limited cached_method.
Neither offers storage that is persistent across sessions or
supports publication of the result or sharing within a
collaboration. We will use, extend and contribute back to an
appropriate established persistent memoisation infrastructure, such
as python-joblib, redis-simple-cache or dogpile.cache, adding
features needed for storage and use of results in mathematical
research. We will design something that is simple to deploy and
configure, and makes it easy to share results in a controlled
manner, but provides enough assurance to enable the user to rely on
the data, give proper credit to the original computation and rerun
the computation if they want to.
Strong experience in the design and practical implementation of mathematics software: computational mathematics software (e.g. SageMath), knowledge management systems, or proof systems;
PhD in mathematics or computer science;
Experience in open-source development (collaborative development tools, interaction with the community, …);
Fluency in programming languages such as Scala, Python, Julia, etc appreciated;
Strong communication skills;
Fluency in oral and written English; speaking French is not a prerequisite.
The position will be funded by OpenDreamKit, a Horizon 2020 European Research Infrastructure project that will run for four years, starting from September
Within this ecosystem, the developer will work primarily on the free open-source mathematics software system SageMath. Based on the Python language and many existing open-source math libraries, SageMath has been developed for 10 years by a worldwide community of 300 researchers, teachers and engineers, and has reached 1.5M lines of code.
The developer will work within one of the largest teams of SageMath developers, composed essentially of researchers in mathematics and computer science, at the Laboratoire de Recherche en Informatique (LRI) and in nearby institutions. The LRI also hosts a strong team working on proof systems.
To apply for this position, please send an e-mail to Nicolas.Thiery at u-psud.fr before December 1st, with the following documents attached:
cover_letter.pdf: a cover letter, in English (why are you interested in this particular position);
CV.pdf: a CV, highlighting among other things your skills and background and your contributions to open source software;
phd_reports.pdf: PhD reports (when applicable);
reference letters (each named reference_letter_.pdf), or alternatively reference contact information.
Applications sent after December 1st will be considered until the position is filled.
Viviane Pons presented the OpenDreamKit project and its impact for teaching to the Netmath community
OpenDreamKit WP6 (Data/Knowledge/Software-Bases) has reported on the first use cases in two papers to be published at MACIS 2017.
One of the main tasks for OpenDreamKit (T3.1) is improving portability of mathematical software across hardware platforms and operating systems.
One particular such challenge, which has dogged the SageMath project practically since its inception, is getting a fully working port of Sage on Windows (and by extension this would mean working Windows versions of all the CAS’s and other software Sage depends on, such as GAP, Singular, etc.)
This is particularly challenging, not so much because of the Sage Python library (which has some, but relatively little system-specific code). Rather, the challenge is in porting all of Sage’s 150+ standard dependencies, and ensuring that they integrate well on Windows, with a passing test suite.
Although UNIX-like systems are popular among open source software developers and some academics, the desktop and laptop market share of Windows computers is estimated to be more than 75% and is an important source of potential users, especially students.
However, for most of its existence, the only way to “install” Sage on Windows was to run a Linux virtual machine that came pre-installed with Sage, which is made available on Sage’s downloads page. This is clumsy and onerous for users–it forces them to work within an unfamiliar OS, and it can be difficult and confusing to connect files and directories in their host OS to files and directories inside the VM, and likewise for web-based applications like the notebook. Because of this Windows users can feel like second-class citizens in the Sage ecosystem, and this may turn them away from Sage.
Attempts at Windows support are almost as old as Sage itself (the initial Sage release was in 2005). Microsoft offered funding to work on a Windows version as far back as 2007, but it was far too little for the amount of effort needed.
Additional work was done off and on through 2012, and partial support was possible at times. This included admirable work to try to support building with the native Windows development toolchain (e.g. MSVC). There was even at one time an earlier version of a Sage installer for Windows, but it has long since been abandoned.
However, Sage development (and more importantly Sage’s dependencies) continued to advance faster than there were resources for the work on Windows support to keep up, and work mostly stalled after 2013. OpenDreamKit has provided a unique opportunity to fund the kind of sustained effort needed for Sage’s Windows support to catch up.
As of SageMath version 8.0, Sage will be available for 64-bit versions of Windows 7 and up. It can be downloaded through the SageMath website, and up-to-date installation instructions are being developed at the SageMath wiki. A 32-bit version had been planned as well, but is on hold due to technical limitations that will be discussed later.
The installer contains all software and documentation making up the standard Sage distribution, all libraries needed for Cygwin support, a bash shell, numerous standard UNIX command-line utilities, and the Mintty terminal emulator, which is generally more user-friendly and better suited for Cygwin software than the standard Windows console.
It is distributed in the form of a single-file executable installer, with a familiar install wizard interface (built with the venerable InnoSetup). The installer comes in at just under a gigabyte, but unpacks to more than 4.5 GB in version 8.0.
Because of the large number of files comprising the complete SageMath distribution, and the heavy compression of the installer, installation can take a fair amount of time even on a recent system. On my Intel i7 laptop it takes about ten minutes, but results will vary. Fortunately, this has not yet been a source of complaints–beta testers have been content to run the installer in the background while doing other work–on a modern multi-core machine the installer itself does not use overly many resources.
If you don’t like it, there’s also a standard uninstall:
The installer includes three desktop and/or start menu shortcuts:
The shortcut titled just “SageMath 8.0” launches the standard Sage command prompt in a text-based console. In general it integrates well enough with the Windows shell to launch files with the default viewer for those file types. For example, plots are saved to files and displayed automatically with the default image viewer registered on the computer.
(Because Mintty supports SIXEL mode graphics, it may also be possible to embed plots and equations directly in the console, but this has not been made to work yet with Sage.)
“SageMath Shell” runs a bash shell with the environment set up to run software in the Sage distribution. It is intended for more advanced users, or users who wish to directly use other software included in the Sage distribution (e.g. GAP, Singular) without going through the Sage interface. Finally, “SageMath Notebook” starts a Jupyter Notebook server with Sage configured as the default kernel and, where possible, opens the Notebook interface in the user’s browser.
In principle this could also be used as a development environment for doing development of Sage and/or Sage extensions on Windows, but the current installer is geared primarily just for users.
There are a few possible routes to supporting Sage on Windows, of which Cygwin is just one. For example, before restarting work on the Cygwin port I experimented with a solution that would run Sage on Windows using Docker. I built an installer for Sage that would install Docker for Windows if it was not already installed, install and configure a pre-built Sage image for Docker, and install some desktop shortcuts that attempted to launch Sage in Docker as transparently as possible to the user. That is, it would ensure that Docker was running, that a container for the Sage image was running, and then would redirect I/O to the Docker container.
This approach “worked”, but was still fairly clumsy and error-prone. In order to make the experience as transparent as possible a fair amount of automation of Docker was needed. This could get particularly tricky in cases where the user also uses Docker directly, and accidentally interferes with the Sage Docker installation. Handling issues like file system and network port mapping, while possible, was even more complicated. What’s worse, running Linux images in Docker for Windows still requires virtualization. On older versions this meant running VirtualBox in the background, while newer versions require the Hyper-V hypervisor (which is not available on all versions of Windows–particularly “Home” versions). Furthermore, this requires hardware-assisted virtualization (HAV) to be enabled in the user’s BIOS. This typically does not come enabled by default on home PCs, and users must manually enable it in their BIOS menu. We did not consider this a reasonable step to ask of users merely to “install Sage”.
Another approach, which was looked at in the early efforts to port Sage to Windows, would be to get Sage and all its dependencies building with the standard Microsoft toolchain (MSVC, etc.). This would mean both porting the code to work natively on Windows, using the MSVC runtime, as well as developing build systems compatible with MSVC. There was a time when, remarkably, many of Sage’s dependencies did meet these requirements. But since then the number of dependencies has grown so much, and Sage itself has become so dependent on the GNU toolchain, that this would be an almost impossible undertaking.
A middle ground between MSVC and Cygwin would be to build Sage using the MinGW toolchain, which is a port of GNU build tools (including binutils, gcc, make, autoconf, etc.) as well as some other common UNIX tools like the bash shell to Windows. Unlike Cygwin, MinGW does not provide emulation of POSIX or Linux system APIs–it just provides a Windows-native port of the development tools. Many of Sage’s dependencies would still need to be updated in order to work natively on Windows, but at the very least their build systems would require relatively little updating–not much more than is required for Cygwin. This would actually be my preferred approach, and with enough time and resources it could probably work. However, it would still require a significant amount of work to port some of Sage’s more non-trivial dependencies, such as GAP and Singular, to work on Windows without some POSIX emulation.
So Cygwin is the path of least resistance. Although bugs and shortcomings
in Cygwin itself occasionally require some effort to work around (as a
developer–users should not have to think about it), for the most part it
just works with software written for UNIX-like systems. It also has the
advantage of providing a full UNIX-like shell experience, so shell scripts
and scripts that use UNIX shell tools will work even on Windows. However,
since it works directly on the native filesystem, there is less opportunity
for confusion regarding where files and folders are saved. In fact, Cygwin
supports both Windows-style paths (starting with C:\) and UNIX-style
paths (in this case starting with C:/).
Finally, a note on the Windows Subsystem for Linux (WSL), which debuted
shortly after I began my Cygwin porting efforts, as I often get asked about
this: “Why not ‘just’ use the ‘bash for Windows’?” The WSL is a new effort
by Microsoft to allow running executables built for Linux directly on
Windows, with full support from the Windows kernel for emulation of Linux
system calls (including ones like fork()). Basically, it aims to provide
all the functionality of Cygwin, but with full support from the kernel, and
the ability to run Linux binaries directly, without having to recompile
them. This is great of course. So the question is asked if Sage can run in
this environment, and experiments suggest that it works pretty well
(although the WSL is still under active development and has room for
improvement).
I wrote more about the WSL in a blog post last year, which also addresses why we can’t “just” use it for Sage for Windows. But in short: 1) The WSL is currently only intended as a developer tool: There’s no way to package Windows software for end users such that it uses the WSL transparently. And 2) It’s only available on recent updates of Windows 10–it will never be available on older Windows versions. So to reach the most users, and provide the most hassle-free user experience, the WSL is not currently a solution. However, it may still prove useful for developers as a way to do Sage development on Windows. And in the future it may be the easiest way to install UNIX-based software on Windows as well, especially if Microsoft ever expands its scope.
The main challenge with porting Sage to Windows/Cygwin has relatively little to do with the Sage library itself, which is written almost entirely in Python/Cython and involves relatively few system interfaces (a notable exception to this is the advanced signal handling provided by Cysignals, but this has been found to work almost flawlessly on Cygwin thanks to the Cygwin developers’ heroic efforts in emulating POSIX signal handling on Windows). Rather, most of the effort has gone into build and portability issues with Sage’s more than 150 dependencies.
The majority of issues have been build-related issues. Runtime issues are less common, as many of Sage’s dependencies are primarily mathematical, numerical code–mostly CPU-bound algorithms that have little use of platform-specific APIs. Another reason is that, although there are some anomalous cases, Cygwin’s emulation of POSIX (and some Linux) interfaces is good enough that most existing code just works as-is. However, because applications built in Cygwin are native Windows applications and DLLs, there are Windows-specific subtleties that come up when building some non-trivial software. So most of the challenge has been getting all of Sage’s dependencies building cleanly on Cygwin, and then maintaining that support (as the maintainers of most of these dependencies are not themselves testing against Cygwin regularly).
In fact, maintenance was the most difficult aspect of the Cygwin port (and this is one of the main reasons past efforts failed–without a sustained effort it was not possible to keep up with the pace of Sage development). I had a snapshot of Sage that was fully working on Cygwin, with all tests passing, as early as the end of summer 2016. That is, I started with one version of Sage and added to it all the fixes needed for that version to work. However, by the time that work was done, Sage had moved on: I had to redo my work on top of many new developments, and there were many new issues to fix. This cycle repeated itself a number of times.
The critical component that was missing for creating a sustainable Cygwin port of Sage was a patchbot for Cygwin. The Sage developers maintain a (volunteer) army of patchbots–computers running a number of different OS and hardware platforms that perform continuous integration testing of all proposed software changes to Sage. The patchbots are able, ideally, to catch changes that break Sage–possibly only on specific platforms–before they are merged into the main development branch. Without a patchbot testing changes on Cygwin, there was no way to stop changes from being merged that broke Cygwin. With some effort I managed to get a Windows VM with Cygwin running reliably on UPSud’s OpenStack infrastructure, that could run a Cygwin patchbot for Sage. By continuing to monitor this patchbot the Sage community can now receive prior warning if/when a change will break the Cygwin port. I expect this will impact only a small number of changes–in particular those that update one of Sage’s dependencies.
In so doing we are, indirectly, providing continuous integration on Cygwin for Sage’s many dependencies–something most of those projects do not have the resources to do on their own. So this should be considered a service to the open source software community at large. (I am also planning to piggyback on the work I did for Sage to provide a Cygwin buildbot for Python–this will be important moving forward as the official Python source tree has been broken on Cygwin for some time, but is one of the most critical dependencies for Sage).
All that said, a few of the runtime bugs that come up are non-trivial as well. One particular source of bugs is subtle synchronization issues in multi-process code, that arise primarily due to the large overhead of creating, destroying, and signalling processes on Cygwin, as compared to most UNIXes. Other problems arise in areas of behavior that are not specified by the POSIX standard, and assumptions are made that might hold on, say, Linux, but that do not hold on Cygwin (but that are still POSIX-compliant!) For example, a difference in (undocumented, in both cases) memory management between Linux and Cygwin made for a particularly challenging bug in PARI. Another interesting bug came up in a test that invoked a stack overflow bug in Python, which only came up on Cygwin due to the smaller default stack size of programs compiled for Windows. There are also occasional bugs due to small differences in numerical results, due to the different implementation of the standard C math routines on Cygwin, versus GNU libc. So one should not come away with the impression that porting software as complex as Sage and its dependencies to Cygwin is completely trivial, nor that similar bugs might not arise in the future.
The original work of porting Sage to Cygwin focused on the 32-bit version of Cygwin. In fact, at the time that was the only version of Cygwin–the first release of the 64-bit version of Cygwin was not until 2013. When I picked up work on this again I focused on 64-bit Cygwin: most software developers today work primarily on 64-bit systems, and in my experience from many past projects, software tends to be more stable on 64-bit systems as a result. I figured this would likely be true for Sage and its dependencies as well.
In fact, after getting Sage working on 64-bit Cygwin, when it came time to test on 32-bit Cygwin I hit some significant snags. Without going into too many technical details, the main problem is that 32-bit Windows applications have a user address space limited to just 2 GB (or 3 GB with a special boot flag). This is in fact not enough to fit all of Sage into memory at once. The good news is that for most cases one would never try to use all of Sage at once–this is only an issue if one tries to load every library in both Sage, and all its dependencies, into the same address space. In practical use this is rare, though this limit can be hit while running the Sage test suite.
With some care, such as reserving address space for the most likely to be used (especially simultaneously) libraries in Sage, we can work around this problem for the average user. But the result may still not be 100% stable.
It becomes a valid question whether it’s worth the effort. There are unfortunately few publicly available statistics on the current market share of 64-bit versus 32-bit Windows versions among desktop users. Very few new desktops and laptops sold to the consumer market still include 32-bit OSes, but they are not too uncommon on some older, lower-end laptops. In particular, some laptops sold not too long ago with Windows 7 were 32-bit. According to Net Market Share, as of writing Windows 7 still makes up nearly 50% of all desktop operating system installations. This still does not tell us about 32-bit versus 64-bit. The popular (12.5 million concurrent users) Steam PC gaming platform publishes the results of its usage statistics survey, which as of writing shows barely over 5% of users with 32-bit versions of Windows. However, computer gamers are not likely to be representative of the overall market, being more likely to upgrade their software and hardware.
So until some specific demand for a 32-bit version of SageMath for Windows is heard, we will not likely invest more effort into it.
Focusing on Cygwin for porting Sage to Windows was definitely the right way to go. It took me only a few months in the summer of 2016 to get the vast majority of the work done. The rest was just a question of keeping up with changes to Sage and fixing more bugs (this required enough constant effort that it’s no wonder nobody managed to quite do it before). Now, however, enough issues have been addressed that the Windows version has remained fairly stable, even in the face of ongoing updates to Sage.
Porting more of Sage’s dependencies to build with MinGW and without Cygwin might still be a worthwhile effort, as Cygwin adds some overhead in a few areas, but if we had started with that it would have been too much effort.
In the near future, however, the priority needs to be improvements to user experience of the Windows Installer. In particular, a better solution is needed for installing Sage’s optional packages on Windows (preferably without needing to compile them). And an improved experience for using Sage in the Jupyter Notebook, such that the Notebook server can run in the background as a Windows Service, would be nice. This feature would not be specific to Sage either, and could benefit all users of the Jupyter Notebook on Windows.
Finally, I need to better document the process of doing Sage development on Cygwin, including the typical kinds of problems that arise. I also need to better document how to set up and maintain the Cygwin patchbot, and how to build releases of the Sage on Windows installer so that its maintenance does not fall solely on my shoulders.
Dear Datadog,
Everybody on my team was completely misled by your
horrible pricing description.
Please cancel the subscription for wstein immediately
and remove my credit card from your system.
This is the first time I've wasted this much money
by being misled by a website in my life.
I'm also very unhappy that I can't delete my credit
card or cancel my subscription via your website. It's
like one more stripe API call to remove the credit card
(I know -- I implemented this same feature for my site).
Thanks for reaching out. If you'd like to cancel your
Datadog subscription, you're able to do so by going into
the platform under 'Plan and Usage' and choose the option
downgrade to 'Lite', that will ensure your credit card
will not be charged in the future. Please be sure to
reduce your host count down to the (5) allowed under
the 'Lite' plan - those are the maximum allowed for
the free plan.
Also, please note you'll be charged for the hosts
monitored through this month. Please take a look at
our billing FAQ.
by William Stein ([email protected]) at September 22, 2017 01:03 PM
07/22/2016 449215JWJH87S8N4 DATADOG 866-329-4466 NY $639.19
08/29/2016 2449215L2JH87V8WZ DATADOG 866-329-4466 NY $927.22
by William Stein ([email protected]) at September 22, 2017 12:32 PM
Viviane Pons gave a Sage Introduction talk at the international combinatorics conference Eurocomb 2017.
OpenDreamKit is hosting a workshop on live structured documents to take place at Simula Research Laboratory in Oslo, Norway from Monday 16. October to Friday 20. October.
The workshop is dedicated to various aspects of live documents, including:
Participants can register via eventbrite.
Groups St Andrews is a conference series with a conference every four years. This year’s Groups St Andrews will be in Birmingham, and I will attend, bring a poster, give a contributed talk about computing in permutation groups, and teach a course on GAP.
This post serves the main purpose of providing a page on the OpenDreamKit website that can hold all the links that will appear on the poster, and possible further information.
One of the deliverables (D4.13) of the OpenDreamKit project is refactoring the documentation system of SageMath. The SageMath documentation is built using a heavily customized Sphinx. Many of the customizations are necessary to support autodoc (automatically generated documentation from docstrings) for Cython files.
Thanks to some changes I made to Sphinx, autodoc for Cython now works provided that:
1. You use Sphinx version 1.6 or later.
2. The Cython code is compiled with the binding=True directive. See How to set directives in the Cython documentation.
3. A small monkey-patch is applied to inspect.isfunction. You can put this in your Sphinx conf.py for example:
    import inspect

    def isfunction(obj):
        # Treat anything whose type exposes __code__ as a function.
        return hasattr(type(obj), "__code__")

    inspect.isfunction = isfunction
This was used successfully for the documentation of cysignals and fpylll. There is ongoing work to do the same for SageMath.
To understand why items 2 and 3 on the above list are needed, we need to look at how Python implements functions. In Python, there are two kinds of functions (we really mean functions here, not methods or other callables):
User-defined functions, defined with def or lambda:
>>> def foo(): pass
>>> type(foo)
<class 'function'>
>>> type(lambda x: x)
<class 'function'>
Built-in functions such as len, repr or isinstance:
>>> type(len)
<class 'builtin_function_or_method'>
In the CPython implementation, these are completely independent classes with different behaviours.
Just to give one example, built-in functions do not have a __get__ method, which means that they do not become methods when used in a class.
Let’s consider this class:
class X(object):
    def printme(self):
        return repr(self)
This is essentially equivalent to
class X(object):
    printme = (lambda self: repr(self))
>>> X().printme()
'<__main__.X object at 0x7fb342f960b8>'
However, directly putting the built-in function repr in the class does not work as expected:
class Y(object):
    printme = repr
>>> Y().printme()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: repr() takes exactly one argument (0 given)
This is simply something that built-in functions do not support.
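The difference can also be checked directly. The following is a minimal sketch in plain CPython (no Cython involved); the names user_defined, Y, wrapped and the result variables are made up for illustration only.

```python
def user_defined(self):
    return repr(self)


# User-defined functions are descriptors (their type has __get__);
# built-in functions such as repr are not.
has_get_user = hasattr(type(user_defined), "__get__")
has_get_builtin = hasattr(type(repr), "__get__")


class Y(object):
    printme = repr          # stored as-is, never bound to the instance
    wrapped = user_defined  # bound to the instance on attribute access


y = Y()

try:
    y.printme()             # called with no arguments at all
    builtin_was_bound = True
except TypeError:
    builtin_was_bound = False

wrapped_result = y.wrapped()  # y is passed implicitly as self

print(has_get_user, has_get_builtin, builtin_was_bound)
```

Wrapping the built-in in a user-defined function, as in the lambda version of class X, is what makes the attribute behave as a method.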
Here is a list of the main differences between user-defined and built-in functions:
User-defined functions are implemented in Python, built-in functions are implemented in C.
Only user-defined functions support __get__ and can become methods (see above).
Only user-defined functions support introspection such as inspect.getargspec() and inspect.getsourcefile().
CPython has specific optimizations for calling built-in functions.
The inspect module and profiling distinguish between the two kinds of functions.
Cython generates C code, so Cython functions must be built-in functions. This has unfortunate disadvantages, such as the lack of introspection support, which is particularly important for Sphinx.
Luckily, the Cython developers came up with a solution:
they invented a completely new function type (called cyfunction),
which is implemented like built-in functions
but which behaves as much as possible like user-defined functions.
By default, functions in Cython are built-in functions.
With the directive binding=True, functions in Cython become cyfunctions.
Since cyfunctions are not specifically optimized by CPython,
this comes with a performance penalty.
More precisely, calling cyfunctions from Python is slower than
calling built-in functions from Python.
The slowdown can be significant for simple functions.
Within Cython, cyfunctions are as fast as built-in functions.
Since a cyfunction is neither a built-in function
nor a user-defined function (those two types are not subclassable),
the inspect module (and hence Sphinx) does not recognize it as being a function.
So, to have full inspect support for Cython functions,
we need to change inspect.isfunction.
After various attempts, I came up with hasattr(type(obj), "__code__")
to test whether the object obj is a function (for introspection purposes).
This will match user-defined functions and cyfunctions
but not built-in functions, nor any other Python type that I know of.
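To see what this test accepts and rejects without compiling any Cython code, here is a rough sketch. FakeCyFunction is a hypothetical stand-in, not the real cyfunction type: it only mimics the one property the test relies on, namely a __code__ attribute available on the type. The name patched_isfunction is likewise chosen for this example.

```python
import inspect


def patched_isfunction(obj):
    # The test from the text: look for __code__ on the type, not the instance.
    return hasattr(type(obj), "__code__")


class FakeCyFunction(object):
    """Hypothetical stand-in for a cyfunction (not a real Cython type)."""

    # Class-level __code__ attribute, which is all the test looks for.
    __code__ = (lambda: None).__code__

    def __call__(self):
        return None


def user_defined():
    pass


fake = FakeCyFunction()

# (standard inspect.isfunction, patched test) for each kind of object
results = {
    "user-defined": (inspect.isfunction(user_defined), patched_isfunction(user_defined)),
    "built-in": (inspect.isfunction(len), patched_isfunction(len)),
    "cyfunction stand-in": (inspect.isfunction(fake), patched_isfunction(fake)),
}

for kind, (standard, patched) in results.items():
    print(kind, standard, patched)
```

Only the stand-in changes status between the two tests, which is the intended effect of the monkey-patch: user-defined functions stay accepted and built-in functions stay rejected.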
I have some vague plans for a Python Enhancement Proposal (PEP) to change the implementation of the Python function types. The goal is that Cython functions can be implemented on top of some standard Python function type, with all features that cyfunctions currently have, the performance of built-in functions and introspection support of user-defined functions.
At this point, it is too early to say anything about the implementation of this hypothetical future Python function type. If anything happens, I will surely post an update.
"Google open source guru Chris DiBona says that the web giant continues to ban the lightning-rod AGPL open source license within the company because doing so "saves engineering time" and because most AGPL projects are of no use to the company."
This is just the way it is -- it's psychology and culture, so deal with it. In contrast, companies very frequently embrace open source code that is licensed under the Apache or BSD licenses, and they keep such projects alive. The extremely popular PostgreSQL database is licensed under an almost-BSD license. MySQL is freely licensed under the GPL, but there are good reasons why people buy a commercial license for it (from Oracle). Like RethinkDB, MongoDB is AGPL licensed, but they are happy to sell a different license to companies.
by William Stein ([email protected]) at June 01, 2017 01:25 PM
This project seeks to implement several common matroid classes in SageMath, along with algorithms for their display and relevant computations. The graphic matroid class in particular will be implemented with a representative graph with methods for Whitney switching and minor operations. This will be accompanied by improvements to the graph theory library, with methods relevant to matroids enabled to support multigraphs. Other modules for this project include improved plotting of rank 3 matroids to eliminate false colinearities, computation of a matroid's automorphism group using SageMath's group theory libraries, and faster minor testing based on an existing trac ticket.
The sage-dynamics community has compiled a wishlist of algorithms and functionality that researchers would like added to Sage, and as a member of that community I would like to shorten it. For my project I will complete some of the desired additions from the Sage Dynamics Wiki: I will implement Well’s Algorithm, strengthen the numerical precision in canonical_height, and implement reduced_form for higher dimensions.
There are three major things that I would like to implement to improve the functionality of Sage in the area of complex dynamics. The details of the project are summarized in the following list:
- Complex Dynamics Graphical package: Integrate or implement a complex dynamics software such as Mandel into Sage. This will be done by creating an optional package for Sage. If there is enough demand, the package may become a standard package for Sage at some point.
- Spider Algorithm: The object of the Spider Algorithm is to construct polynomials with assigned combinatorics. For example, we may want to find a polynomial that has a periodic orbit of period 7. The Spider Algorithm provides a way for us to compute this polynomial efficiently. I plan to implement this algorithm into Sage.
- Coercion: If you have a map defined over Q, you should be able to take the image of a point over C (i.e. somewhere you have a well-defined embedding) without having to use the command "change_ring()". Something similar works for polynomials in Sage but it does not work for morphisms/schemes.
This project aims to provide a linear-time implementation of modular decomposition for graphs and digraphs. Modular decomposition is the decomposition of a graph into modules. A module is a subset of vertices that generalizes the notion of a connected component: for a module X, every vertex v ∉ X is adjacent either to every vertex of X or to none of them. Modules can also be nested, that is, one module can be a subset of another. Various algorithms for modular decomposition have been published; the focus of this project is on linear-time algorithms that can be implemented in practice. The project further aims to use the code developed for modular decomposition to implement other functionality, such as skew partitions. A skew partition is a partition of the vertices of a graph into two sets such that the subgraph induced by one set is disconnected and the complement of the subgraph induced by the other set is disconnected. Modular decomposition is a very important concept in graph theory with a number of use cases; for instance, it has been an important tool for solving optimization and combinatorial problems.
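The defining property of a module can be tested directly. The sketch below is only an illustration for undirected graphs stored as a dictionary of neighbour sets; it is a brute-force check, not one of the linear-time decomposition algorithms the project targets, and the function name is_module and the example graph are invented for this example.

```python
def is_module(adjacency, candidate):
    """Return True if `candidate` is a module of the undirected graph.

    `adjacency` maps each vertex to the set of its neighbours.
    """
    candidate = set(candidate)
    for vertex, neighbours in adjacency.items():
        if vertex in candidate:
            continue
        seen = neighbours & candidate
        # An outside vertex must see all of the candidate set or none of it.
        if seen and seen != candidate:
            return False
    return True


# Example graph with edges 1-3, 2-3 and 3-4.
graph = {
    1: {3},
    2: {3},
    3: {1, 2, 4},
    4: {3},
}

print(is_module(graph, {1, 2}))  # 3 sees both, 4 sees neither
print(is_module(graph, {1, 3}))  # 2 sees 3 but not 1
```

Single vertices and the full vertex set always pass this check, which is why decomposition algorithms are only interested in the non-trivial modules.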
Modular decomposition of (di)graphs is a generalization of the concept of the decomposition of (di)graphs into connected components. Its current implementation in Sage relies on badly broken abandoned C code, and badly needs to be replaced by something that works and is not too slow. However, the only open-source implementations of some of these procedures are either in Java or in Perl, and thus aren't really useful for Sage.
I aim to implement visualizations of several key constructs in cluster algebras and quiver representations. The first is Auslander-Reiten quivers, for at least the A_n and D_n cases. The second is labelled endomorphism quivers and mutations within a cluster category, focusing on the A_n case. The third is posets of down-mutations for the A_n case. These features will be useful not only for research purposes, but also as nice examples to play around with and learn from. Aside from these features, I am interested in implementing features for the Quantum Cluster Algebras project.
by Harald Schilly ([email protected]) at May 05, 2017 06:03 PM
We are happy to announce release 0.3.0 of nbdime, continuing to improve the process of working with Jupyter notebooks in version control.
The highlight of 0.3 is much improved integration with git, making it easier than ever to get started with nbdime in git:
pip install --upgrade nbdime # install nbdime
nbdime config-git --global --enable # tell git to use nbdime when it sees notebooks
and you can get nice GUI diffs directly from git refs on the command-line:
nbdiff-web master mynotebook.ipynb
On April 26th, OpenDreamKit underwent its first formal review by the European Commission. We presented the achievements of the first 18 months of the project, including 30 deliverables (reports, slides). Overall, the feedback was very positive, with language such as “enthusiast”, “brilliant”, “amazing job”, or “things have come along fantastically”. We made a strong point in our reports and presentations that a vast majority of what’s happening comes from the ecosystem we support. All we do is exploit the special resources the EU is entrusting us with to knock down some tough hurdles that are preventing the ball from rolling. Kudos to our communities!
About twenty of us were in Brussels early this week for the OpenDreamKit Month 18 formal review. After two days of intensive preparation, we presented our work on Wednesday to our project officer and reviewers.
There are a few points that we need to think about (not unexpected). But otherwise the hard work we have all put in since the beginning of the project came out as quite a show. The panel gave very constructive feedback and were overall really happy. They appreciate our approach, our work, our spirit.
Now is the time to enjoy that appreciation and build on that energy to do even better in the coming years. Pass this on to our communities!
Speaking of funding: the reviewers made a strong point that we bear a big responsibility: apparently mathematics does not have a good press in the high spheres these days. We were very lucky, as a math project, to be funded; it’s really because they appreciated so much the strength of the proposal and our “clever and creative interpretation of the call” that we made it through. “No other math project is being funded” (this quote obviously does not apply to ERCs; the scope is plausibly that of H2020 projects).
They now need strong ammunition to make sure that future calls leave room for mathematics. So not only do we have to succeed because we care so much about our aims (and should investigate followups to pursue them further), but also for the sake of other projects elsewhere in mathematics. We also need to proactively explain and highlight to a wider audience what we do in collaboration with our communities. There is very good stuff going on, let it be seen.
Some further thought needs to be put into how to achieve that. For now, the take-home message is simple: if you witness something nice happening, from a technical achievement with a wow factor to a thought-provoking anecdote, write a blog post about it. See the instructions, or even just send a brief draft text by e-mail to Mike Croucher with me in CC.
Let me conclude by thanking the whole band that came to Brussels (with a special nod to the presenters on whom we dumped the most delicate presentations). I was as frustrated as you all were at spending all this time together without tackling what we all really care about most. However, we built up our image with the Commission, and used the occasion to strengthen our group around a joint vision. This is a worthwhile long-term investment.
Thank you everybody for all the enthusiastic, dedicated and beautiful work. It’s an honor and a pleasure to be working with such a team.
Remember: pass it on to those supporting you and to your communities.
Cheers, Nicolas
On the occasion of its first (very successful!) formal review by the EU Commission, twenty OpenDreamKit participants met in the last week of April 2017 at the CLORA (Club of associated research organisations) headquarters in Brussels.
A framadate poll was created
Nicolas THIERY; Benoît PILORGET; Erik BRAY; Viviane PONS; Vincent DELECROIX; Michael KOHLHASE; Dennis MUELLER; Florian RABE; Tom WIESING; Clément PERNET; Wolfram DECKER; William HART; Dmitrii PASECHNIK; Marcin KOSTUR; Mike CROUCHER; Hans FANGOHR; Alexander KONOVALOV; Stephen LINTON; Luca DE FEO; John CREMONA; Paul-Olivier DEHAYE; Benjamin RAGAN-KELLEY; Jeroen DEMEYER; Konrad HINSEN
Benoît Pilorget (BP) announced that the Second Amendment should be over in April-May. All the modifications concerning deliverables and the scientific context were accepted. The remaining blocking points were purely administrative and require some time. At the moment these notes are being written (19/05/2017), the Commission seems to have fully agreed on all terms and is about to sign the amendment.
Related to this amendment, the consortium expressed their congratulations to Hans Fangohr (HF) for his new position at XFEL, in Hamburg. All points of the amendment can be found on the github issue #193
An open brainstorming session took place. Following the retirement of Ursula Martin, it appears that some aspects of WP7 that require research-grade expertise in sociology will hardly be achievable with the work package as it is organised today within the current consortium. Therefore solutions must be found so that we don’t just tick the boxes but actually deliver high quality material.
Several options have been discussed:
1) Hire new staff specialised in sociology or a similar field. This solution would probably lead to the transfer of some funding within the consortium, unless it turns out enough Person-Months are planned at UOXF
2) Subcontract the part of the planned work that is not feasible. For this solution to work out, one must find an adequate subcontractor (provided enough funds are available within UOXF or the consortium) and sign an amendment to the Grant with the Commission
3) Rethink the scientific content (objectives, tasks, deliverables), to take into account all we have learned since the writing of the proposal, and make the best use of the available resources and consortium expertise. This of course would require a negotiation with the EU and probably a new amendment to the grant agreement.
The consortium is expecting official feedback from the Project Officer and reviewers after the formal review. In the meantime, the Coordinator and Principal Investigators of WP7 will be informally brainstorming all possibilities. Should an amendment be necessary, it will be written after the current amendment for the addition of FAU Erlangen and XFEL is signed by the Commission.
After a tour de table, the Coordinator ensured that all deliverables due for Month 24 (31/08/2017) have a leader and a definite working plan.
BP reminded the consortium of the support slides for the Review that were presented at the Edinburgh steering committee meeting.
HF, the chair of the Quality Review Board (QRB), reported broadly positive feedback from the first QRB meeting. A full report will be made available for all participants.
OpenDreamKit uses a static website powered by Jekyll and GitHub. Ever wondered what it means? Read this post to discover.
Last January, Viviane Pons, Jessica Striker and Jennifer Balakrishnan organized the first WomenInSage event in Europe with OpenDreamKit. 20 women spent a week together coding and learning in a rented house in the Paris area.
To open the workshop, Viviane, Jessica, and Jennifer gave a series of introduction to Sage lectures at the Institut Henri Poincaré in Paris, covering combinatorics and number theory.
The workshop then moved to the rented house. There, we organized short talk sessions to get to know our respective research fields and expectations for the week. After that, we were able to split into small groups to work on many different projects: STL export, Kummer surfaces, Kuznyechik cipher, Motzkin words, Shioda invariants, and more. We also had presentations on How to contribute to Sage (with a crash course on git) and How to write a Sage package. Every evening, we had a Status report session to share our progress with the group. You can read our program and final status reports on the event wikipage.
Viviane Pons is one of the organizer of the local Paris chapter for PyLadies. She organized a meeting between the WomenInSage mathematician and the PyLadies developers. We were welcomed by Algolia for an afternoon of coding-and-chatting with the PyLadies.
The data presented here come from a post-event questionnaire sent to the participants.
The gender gap is very large in the mathematical software development community. In the OpenDreamKit project, among the 54 participants, only 3 of us are women. This reflects the global situation in the field. Many women mathematicians are still hesitant to join our community and lack confidence in their abilities as developers. Organizing an event targeted at women is a way to motivate them and build up self-confidence in a safe and casual atmosphere.
The women who attended the conference had various levels of programming experience, ranging from 1 (no experience) to 5 (a lot of experience).
This disparity was also reflected in their knowledge of Sage.
As for contributions, only 4 participants had contributed to Sage in the past, including the 3 organizers. Also, a majority of participants had never attended a Sage Days before. Actually, 6 of them had never even heard of Sage Days and 2 of them said they did not think it was “for them”.
To the question “How did the fact that the event was targeted to women impact your decision to come? (Would you have participated in a classical SageDays)”, many participants answered that it was indeed a factor in their decision.
Yes, but it helped. I didn't feel so sure about my skills and being surrounded by women made things easier.
It was a new experience that I don’t regret at all.
I might have participated, but would have been less confident.
I have participated in and benefited from classical SageDays, but found this event to be even better at creating an atmosphere where everyone felt empowered to learn and contribute.
I made a special effort I would not have done for regular sage days.
One of the participants said she would not have felt comfortable sharing a house with men, but that this event was such a positive experience that she would now consider it for other Sage Days. The event helped build up the confidence of the participants: 9 of them said they felt more confident about attending classical Sage Days after the event.
We took advantage of the diverse knowledge background of our group to work together and learn from each other. It was an occasion for many “first times” among participants who had very little experience with Sage:
We worked on 14 tickets during the week, 6 of which have been merged since the conference. All participants said they had learned new things and that it would impact their careers.
This was also an occasion to start projects and form more research and development collaborations for the future.
All of this happened in a very casual and welcoming atmosphere. We used the common rooms of the house to work. We cooked international, vegetarian friendly meals (some participants had brought food and recipes from their home countries). We got to know each other and shared more than code. All participants agreed that it was a very positive experience. When asked to rate the general atmosphere of the conference, all of them gave a 5.
As an organizer, it was also very rewarding and it motivates me to do it again. To the question: “Any other comment you might have?”, we only got one answer.
All three organizers were so very generous with their time and expertise, and created a wonderful supportive environment. Thank-you!
This week the KWARC team (Michael Kohlhase, Florian Rabe, Dennis Müller) and I met in Berlin at the WIAS. The goal was to meet some of the modelers working there, who are very interested in the MMT system and the work in OpenDreamKit. Their entry point is Work Package 6 (interoperability), motivated by the benefits they would get intrinsically from formalizing the mathematical work they do into the OMDoc/MMT language (e.g. addressability of mathematical models), but also with an eye on all the other work packages from OpenDreamKit (e.g. interactive documents). Personally, I focused on working out what I could of a semantic interchange of mathematical objects between Sage and GAP.
To start, we decided to do a bit of prototyping around transitive groups. The first step in the Math-in-the-Middle methodology for interoperability between computer algebra systems is to formalize the mathematical concept itself. Recent progress on the MMT language has actually made this very practical (see also here):
A mathematician should be able to point to this and get near universal agreement in the community on what that means.
Line 24 is of course critical to the definition, but one can see that the rest is well structured and readable. I have omitted here the first five lines, which consist of include statements, and make the whole thing a completely formal definition yet implemented at a very high level of abstraction. You could slim down those includes and build the same thing on flexiformal foundations, e.g. not bother with the logic “deep down”.
Overall, not many mathematicians might be able to write this, but almost any mathematician can navigate her way through it. It also helps that the jEdit editor and the MathHub webserver have drastically improved, especially in ease of use (work done as part of Work Package 4), but also installation and resilience (work done as part of Work Package 3).
Now that we have a target formalization, the idea is to separately make Sage and GAP interact with it. In the Math-in-the-Middle (MitM) formalism adopted for Work Package 6, we think of having in the “center” a system-independent flexiformalization of the mathematical domains (represented in this diagram in blue; replace in your head EC for elliptic curves with TG for transitive groups).
The next step is to work on the reddish clouds, which are the interface theories between this center and the other systems. These interface theories mainly flexiformalize the system-specific aspects of the domain.
On the GAP side, GAP generates for those interfaces OMDoc/MMT Content Dictionaries (CDs) that contain name, type, and documentation for all API functions (constructors, predicates, methods, …). This is automated, has good coverage and is very rich semantically (more on that towards the end of the post). The next step of the plan is then to align the generated system CDs with the MitM formalization through the MMT implements relation of alignment (e.g. an alignment could be: GAP-transitive_group MMT-implements MitM-transitive group). If equivalent Sage CDs were available, as well as Sage alignments, we would get a semantic crosswalk between GAP and Sage by composing the MitM alignments between all those different CDs. This would provide the necessary framework for interoperability.
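To make the idea of composing alignments concrete, here is a minimal sketch that treats each set of alignments as a lookup table from system symbols to MitM symbols. It is only an illustration: the symbol `Sage-TransitiveGroup` and the dictionary layout are assumptions made for this example, not the actual MMT alignment format.

```python
# Hypothetical alignment tables: each maps a system-specific symbol to the
# MitM symbol it implements. "Sage-TransitiveGroup" is an invented name.
GAP_IMPLEMENTS = {
    "GAP-transitive_group": "MitM-transitive group",
}
SAGE_IMPLEMENTS = {
    "Sage-TransitiveGroup": "MitM-transitive group",
}


def crosswalk(source, target):
    """Compose source -> MitM with the inverse of target -> MitM."""
    # Invert the target alignment: MitM symbol -> list of target symbols.
    inverse = {}
    for symbol, mitm in target.items():
        inverse.setdefault(mitm, []).append(symbol)
    # Route every source symbol through the MitM concept it implements.
    return {symbol: inverse.get(mitm, []) for symbol, mitm in source.items()}


gap_to_sage = crosswalk(GAP_IMPLEMENTS, SAGE_IMPLEMENTS)
print(gap_to_sage)
```

Neither system needs to know about the other in this picture: each one only maintains its own table against the MitM center, and the crosswalk falls out of the composition.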
At the moment Sage does export some of its knowledge into CDs, thanks to what was implemented by Nicolas Thiéry, leveraging his category framework. This is unfortunately not enough to cover transitive groups, which have rich structure as category objects (but the “Category of Transitive Groups” does not exist in Sage). Given the circumstances of this workshop, I thus decided to focus on the Sage side, and see what information I could extract about transitive groups.
If you look at Sage’s TransitiveGroup, a lot of mathematical knowledge is acquired from elsewhere through the class hierarchy lying above TransitiveGroup, and the category framework that instruments that hierarchy. This led me to first try to build a model of how the Sage class TransitiveGroup was actually implemented and what it was doing, but this was a mistake. Indeed, it was very difficult, as I got lost between meta-logics and what I was actually trying to do: modeling Sage? modeling how Sage models math? how Python uses Sage to model math? I was trying to do too much, too early and was probably the wrong person to do that.
If you look back at the methodology, the MitM CDs don’t need to link up to the Math-in-the-Middle content dictionary right away. This is actually up to the alignments, which come later (and could be done by a different person). I was trying to do both at once, while my focus should really have been: “how do I export, but not align, as much as possible of the math knowledge embedded in Sage into a language that can easily be processed by the KWARC team?” (for the categories export built by Nicolas Thiéry, the export went through JSON).
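As an illustration of what “export, but not align” could look like, the following sketch dumps the bare facts about a class (name, bases, public method signatures, docstrings) as JSON, with no attempt to interpret them. The class `ToyPermutationGroup` is a stand-in so that the example runs without Sage, and the JSON layout is an assumption of this sketch, not the format used by the categories export.

```python
import inspect
import json


class ToyPermutationGroup:
    """Stand-in class; method names echo Sage's, but the bodies are empty."""

    def degree(self):
        """Return the degree of the group."""

    def orbit(self, point, action="OnPoints"):
        """Return the orbit of a point."""


def export_class(cls):
    """Collect name, bases and public method signatures as plain data."""
    methods = {
        name: {
            "signature": str(inspect.signature(func)),
            "doc": inspect.getdoc(func),
        }
        for name, func in vars(cls).items()
        if inspect.isfunction(func) and not name.startswith("_")
    }
    return {
        "class": cls.__name__,
        "bases": [base.__name__ for base in cls.__bases__],
        "methods": methods,
    }


exported = export_class(ToyPermutationGroup)
print(json.dumps(exported, indent=2, sort_keys=True))
```

The point of keeping the export this dumb is that the alignment work can then be done later, by someone else, on the JSON alone.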
OK then, the question now becomes: “where is math knowledge embedded in Sage that is relevant to the mathematical concept of transitive group?” The first response is of course still “Everywhere!”, but where is the low-hanging fruit?
I found that the best way to communicate around this issue with the KWARC team is by extracting from Sage code a “math skeleton”. For this, the Sage-specific module sageinspect was very useful. I thus introspected the Sage object corresponding to the class TransitiveGroup, and related objects:
# sage/src/sage/structure/sage_object.pyx
cdef class SageObject:
# sage/local/lib/python2.7/site-packages/sage/categories/category.py
class Category(UniqueRepresentation, SageObject):
# sage/src/sage/structure/category_object.pyx
cdef class CategoryObject(SageObject):
# sage/local/lib/python2.7/site-packages/sage/structure/parent.pyx
cdef class Parent(category_object.CategoryObject):
# sage/src/sage/groups/group.pyx
cdef class Group(Parent):
# sage/src/sage/groups/group.pyx
cdef class FiniteGroup(Group):
# sage/local/lib/python2.7/site-packages/sage/groups/perm_gps/permgroup.py
class PermutationGroup_generic(group.FiniteGroup):
# sage/local/lib/python2.7/site-packages/sage/groups/perm_gps/permgroup_named.py
class PermutationGroup_unique(CachedRepresentation, PermutationGroup_generic):
# sage/local/lib/python2.7/site-packages/sage/groups/perm_gps/permgroup_named.py
class TransitiveGroup(PermutationGroup_unique):
What is mathematical here? Clearly, just about everything, but that is because I was selective in the printout given above: I worked up the class hierarchy from TransitiveGroup by hand, but excluded all the Python objects that don’t inherit from SageObject. For instance, you don’t see in that list:
# sage/local/lib/python2.7/site-packages/sage/structure/unique_representation.py
class CachedRepresentation:
CachedRepresentation is only relevant, from a mathematical standpoint, in where it appears as a superclass. Its own internals are pure design decisions for CAS software, not mathematics.
The criterion to use for “related objects” is thus that only objects inheriting from SageObject need to be navigated. So we are navigating in the class hierarchy diamond between TransitiveGroup and SageObject, collecting classes, which I manually imported from the sage library (obviously this could be automated):
from sage.structure.sage_object import SageObject
from sage.categories.category import Category # not strictly in the class hierarchy, but included to facilitate discussion
from sage.structure.category_object import CategoryObject
from sage.structure.parent import Parent
from sage.groups.group import Group
from sage.groups.group import FiniteGroup
from sage.groups.perm_gps.permgroup import PermutationGroup_generic
from sage.groups.perm_gps.permgroup_named import PermutationGroup_unique
from sage.groups.perm_gps.permgroup_named import TransitiveGroup
This is how I selected the objects from which I wanted to extract more information, producing the list of class definitions above.
[Note by the way the weird changes in the path to sageinspect.sage_getsource in the listing above (why? because of interactions between import statements?)]
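The manual collection of classes could be automated along the following lines. This is a minimal sketch using Python's `inspect.getmro`, with empty stand-in classes that mirror the hierarchy printed earlier so that it runs without Sage; with Sage available, one would pass the real `TransitiveGroup` and `SageObject` instead, and the real method resolution order may well contain additional classes.

```python
import inspect


# Empty stand-ins mirroring the class hierarchy printed earlier in the post.
class SageObject:
    pass

class CachedRepresentation:  # deliberately NOT a SageObject
    pass

class CategoryObject(SageObject):
    pass

class Parent(CategoryObject):
    pass

class Group(Parent):
    pass

class FiniteGroup(Group):
    pass

class PermutationGroup_generic(FiniteGroup):
    pass

class PermutationGroup_unique(CachedRepresentation, PermutationGroup_generic):
    pass

class TransitiveGroup(PermutationGroup_unique):
    pass


def math_skeleton(cls, root):
    """Return the classes in the MRO of cls that inherit from root, root first."""
    related = [c for c in inspect.getmro(cls) if issubclass(c, root)]
    return list(reversed(related))


for c in math_skeleton(TransitiveGroup, SageObject):
    bases = ", ".join(base.__name__ for base in c.__bases__)
    print(f"class {c.__name__}({bases}):")
```

The `issubclass` filter is what drops `CachedRepresentation` (and `object`) from the result, which is exactly the selection criterion described above.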
The next step is to add a bit of flesh to that skeleton export. Obviously this is going to be more intricate. I have included here what you get when you look at all the methods coming out of the source code for TransitiveGroup, PermutationGroup_unique, etc. In other words, a completely static navigation to the specific methods. This was the right thing to do for communicating with the KWARC team (or in a blog post), as it distilled Sage to its most interesting bits and we could fill the gaps by relying on common concepts (like “class hierarchy”), but it is wrong for our ultimate purpose. As a quicker way to get more consistent and richer Sage output, I could have navigated dynamically to the relevant classes, and extracted all the methods available from the live objects. This is because tons of methods get added when the object gets created, with a lot of mathematics packed into them. The same math could be reconstructed from the source code, but that would be harder to do, as we would be re-emulating a lot of what Python does.
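The difference between static and dynamic navigation can be seen on a toy example. In the sketch below, `ToyGroup` and `CategoryMethods` are invented stand-ins: the instance swaps its class at creation time for one that mixes in extra methods, which is only loosely modelled on what the category framework does and should not be read as Sage's actual mechanism.

```python
import inspect


class CategoryMethods:
    """Stand-in for methods that a category contributes to its parents."""

    def is_trivial(self):
        return self.order() == 1


class ToyGroup:
    def __init__(self, elements):
        self._elements = list(elements)
        # Swap in a class built at run time that mixes in the category methods.
        self.__class__ = type(
            "ToyGroup_with_category", (type(self), CategoryMethods), {}
        )

    def order(self):
        return len(self._elements)


def static_methods(cls):
    """Names of the functions written in the class body itself."""
    return sorted(n for n, v in vars(cls).items() if inspect.isfunction(v))


def live_methods(obj):
    """Names of every public callable reachable from a live object."""
    return sorted(
        n for n, _ in inspect.getmembers(obj, callable) if not n.startswith("_")
    )


print(static_methods(ToyGroup))
print(live_methods(ToyGroup([0])))
```

Reading the class body alone never reveals `is_trivial`; it only shows up once an object exists, which is the reason for preferring live objects in the long run.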
In any case, here is the full printout of what I get for just the method declarations for PermutationGroup_generic, the Parent that is most interesting:
# sage/local/lib/python2.7/site-packages/sage/groups/perm_gps/permgroup.py
class PermutationGroup_generic(group.FiniteGroup):
def __init__(self, gens=None, gap_group=None, canonicalize=True, domain=None, category=None):
def construction(self):
def _has_natural_domain(self):
def _gap_init_(self):
def _magma_init_(self, magma):
def __cmp__(self, right):
def _element_class(self):
def __call__(self, x, check=True):
def _coerce_impl(self, x):
def list(self):
def __contains__(self, item):
def has_element(self, item):
def __iter__(self):
def gens(self):
def gens_small(self):
def gen(self, i=None):
def identity(self):
def exponent(self):
def largest_moved_point(self):
def degree(self):
def domain(self):
def _domain_gap(self, domain=None):
def smallest_moved_point(self):
def representative_action(self,x,y):
def orbits(self):
def orbit(self, point, action="OnPoints"):
def transversals(self, point):
def stabilizer(self, point, action="OnPoints"):
def base(self, seed=None):
def strong_generating_system(self, base_of_group=None):
def _repr_(self):
def _latex_(self):
def _order(self):
def order(self):
def random_element(self):
def group_id(self):
def id(self):
def group_primitive_id(self):
def center(self):
def socle(self):
def frattini_subgroup(self):
def fitting_subgroup(self):
def solvable_radical(self):
def intersection(self, other):
def conjugacy_class(self, g):
def conjugacy_classes(self):
def conjugate(self, g):
def direct_product(self, other, maps=True):
def semidirect_product(self, N, mapping, check=True):
def holomorph(self):
def subgroup(self, gens=None, gap_group=None, domain=None, category=None, canonicalize=True, check=True):
def as_finitely_presented_group(self, reduced=False):
def quotient(self, N):
def commutator(self, other=None):
def cohomology(self, n, p = 0):
def cohomology_part(self, n, p = 0):
def homology(self, n, p = 0):
def homology_part(self, n, p = 0):
def character_table(self):
def irreducible_characters(self):
def trivial_character(self):
def character(self, values):
def conjugacy_classes_representatives(self):
def conjugacy_classes_subgroups(self):
def subgroups(self):
def _regular_subgroup_gap(self):
def has_regular_subgroup(self, return_group = False):
def blocks_all(self, representatives = True):
def cosets(self, S, side='right'):
def minimal_generating_set(self):
def normalizer(self, g):
def centralizer(self, g):
def isomorphism_type_info_simple_group(self):
def is_abelian(self):
def is_commutative(self):
def is_cyclic(self):
def is_elementary_abelian(self):
def isomorphism_to(self, right):
def is_isomorphic(self, right):
def is_monomial(self):
def is_nilpotent(self):
def is_normal(self, other):
def is_perfect(self):
def is_pgroup(self):
def is_polycyclic(self):
def is_simple(self):
def is_solvable(self):
def is_subgroup(self, other):
def is_supersolvable(self):
def non_fixed_points(self):
def fixed_points(self):
def is_transitive(self, domain=None):
def is_primitive(self, domain=None):
def is_semi_regular(self, domain=None):
def is_regular(self, domain=None):
def normalizes(self, other):
def composition_series(self):
def derived_series(self):
def lower_central_series(self):
def molien_series(self):
def normal_subgroups(self):
def poincare_series(self, p=2, n=10):
def sylow_subgroup(self, p):
def upper_central_series(self):
Here are things a semi-intelligent mathematician can deduce from this fleshed-out skeleton, and that we might be able to export automatically:
__init__ specifies a constructor. In other words, some combination of maps from some parameter space into the object modeled by PermutationGroup_generic. That relationship is messy though, most of the time. Note that the GAP team took the opportunity over last summer to have an intern refactor/regularize the way they did constructors in a more “semantic” way: essentially, instead of using the elementary __init__, they made a defconstructor and gave it documentation, type information, … as parameters. Of course defconstructor elaborates to a call to __init__, but the parameters can be used in the CD generation (and for static type-based optimizations later; ask Markus Pfeiffer @ St. Andrews if you are interested in the details).
_gap_xxxx and _magma_xxxx indicate that the relevant “stuff” exists in the corresponding CASes. This is thus a good place to bootstrap the alignment process between gap and sage, and therefore extract KPIs and generally optimize our progress. This would be best done by instrumenting at the SageObject level, since this is where all those _other-computer-algebra-system_xxxx methods are first located, as abstract methods.
__xxxxxxx__ indicates the existence of a relation of some kind on the elements of PermutationGroup_generic, which is a Sage Parent. However, this information is best extracted from the categories export itself, presumably all(?) the time.
is_xxxx methods indicate the existence of a test and thus a property.
Many of the deductions made above will be done in the same way for all Parents (at least if we go for the easiest information to grab), so that’s where the instrumentation should go. Most of that instrumentation actually makes sense to have in a CAS, because it exposes mathematically relevant concepts. It would simply be used by the exporter generating the Content Dictionary.
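The naming conventions above are regular enough to be applied mechanically. Here is a rough sketch that sorts method names from the printout into those four groups with regular expressions; the labels and the exact patterns are choices made for this illustration, and a real exporter would need to be far more careful.

```python
import re

# Naming conventions read off the printout; the labels are illustrative only.
RULES = [
    (re.compile(r"^__init__$"), "constructor"),
    (re.compile(r"^_(gap|magma)_"), "interface to another CAS"),
    (re.compile(r"^__\w+__$"), "relation on elements"),
    (re.compile(r"^is_\w+$"), "property test"),
]


def classify(name):
    """Return the label of the first naming rule that matches, else 'other'."""
    for pattern, label in RULES:
        if pattern.match(name):
            return label
    return "other"


# A few method names taken from the PermutationGroup_generic printout.
SAMPLE = [
    "__init__", "_gap_init_", "_magma_init_", "__cmp__",
    "__contains__", "is_abelian", "is_transitive", "orbits",
]

for name in SAMPLE:
    print(f"{name:16} {classify(name)}")
```

The order of the rules matters: `__init__` also matches the generic double-underscore pattern, so the constructor rule has to come first.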
Remark: Ultimately we want to extract information from live objects. It should not be lost, however, that what we are trying to do is partly a social process (the study of this process is itself the topic of Work Package 7). Humans have built the code from which we are trying to extract information, and now we want to communicate that with other humans so they can in turn code on top of that. Those other humans are familiar with different tools. For instance the KWARC team uses MMT related tools, like MathHub, but not Sage. Presumably other CAS developers or even “plain” mathematicians will just see Sage through an interface built on top of MMT. So I would advocate that we:
Step 1. could be useful for instance if one is working in GAP and asking “How does Sage do that?”. We should be able to access Sage source code from within GAP, and it will be useful for automating some tasks.
Step 2. would be useful for students in the KWARC group, for instance, who would then be able to extract semantically richer information from a system like Sage with just verbal instructions from domain specific experts, because the data is now in MMT format. It splits the step in two: MMT extraction and semantic extraction, and requires different skills.
The process could be further accelerated, I bet, by also exposing deep Sage introspection tools to MMT.
At this stage self-preservation instincts kick in and I don’t want to think more deeply about this proposal from a logical standpoint.
I wish to thank Michael Kohlhase for suggestions that have improved the first draft of this post.
WP6 participants JacU (Florian Rabe), FAU (Dennis Müller, Michael Kohlhase) and UZH (Paul Olivier Dehaye) came together with members of the Weierstrass Institute for Applied Analysis and Stochastics (WIAS: Thomas Koprucki and Carsten Tabelow) for a one-week code sprint (20. 3. – 24. 3.) on the Math-in-the-Middle Content and Logic and the encoding of mathematical Models. The result of this was a significant extension of the MitM ontology (in particular for the meta-theories for Sage) and a WIAS preprint on formalizations of Models.