The BIG Problem with Big Data

Over the past month, three undergraduate students in the Fields Undergraduate Research Program, Mark Freeman (Harvard University), James McVittie (University of Toronto) and Iryna Sivak (Taras Shevchenko National University), have been studying, under the supervision of Professor Jianhong Wu (York University), the way information (e.g. news articles, videos and photos) propagates through the online social network Digg. Initially, a particular model of interest was proposed and then used to predict the theoretical behaviour of the Digg network as well as the propagation of information from user to user. As in most applied modelling projects, data needed to be introduced to check whether the model assumptions were correct and to validate the prediction accuracy of the model. The problem that then arose was the amount of data that was available.

In most cases, statisticians and applied mathematicians hope for a large data set in order to minimize the variance of predictions and to make inferences from large samples; however, over the past couple of years, the amount of data available in fields ranging from genetics to the modelling of social networks has been staggering. For this particular project, the data was gathered from the website of Kristina Lerman (University of Southern California) (link provided below) in two separate files: digg_votes and digg_friends. The digg_friends file contains over 1.7 million entries identifying which of the 71,367 users are connected with which other users, the time at which each connection was made and the type of connection (directed or mutual). The digg_votes file contains over 3 million entries identifying which users “digged” (voted) for each particular story and the time at which each vote was made. The students then had to combine both datasets to determine which user(s) began the propagation of the information (i.e. the source), the number of steps each voting user is away from the source, and the time difference between the source’s voting time and each user’s voting time, and finally to run a least squares optimization algorithm fitting the model to the voting times. Thus, it appears that dealing with large data is a clumsy and time-consuming procedure, but this is not always the case.
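The compilation steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the students' actual code: the record layouts, the function names propagation_profile and fit_line, the rule that the earliest voter is the source, and the straight line fitted by least squares are all hypothetical, since the article does not specify the file format or the model.

```python
from collections import defaultdict, deque


def propagation_profile(friend_links, story_votes):
    """Return {user: (hops_from_source, seconds_after_source)} for one story.

    friend_links: iterable of (watcher, watched) pairs, meaning `watcher`
                  sees the votes of `watched` (hypothetical layout).
    story_votes:  iterable of (user, unix_time) pairs for a single story.
    """
    vote_time = dict(story_votes)
    # Assumption: the earliest voter is the source of the story.
    source = min(vote_time, key=vote_time.get)

    # Information flows from a watched user to the users watching them.
    watchers = defaultdict(set)
    for watcher, watched in friend_links:
        watchers[watched].add(watcher)

    # Breadth-first search restricted to users who voted on the story.
    hops = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        for nxt in watchers[user]:
            if nxt in vote_time and nxt not in hops:
                hops[nxt] = hops[user] + 1
                queue.append(nxt)

    t0 = vote_time[source]
    return {u: (h, vote_time[u] - t0) for u, h in hops.items()}


def fit_line(points):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Toy data, invented for illustration only.
links = [("b", "a"), ("c", "a"), ("d", "b"), ("e", "z")]
votes = [("a", 100), ("b", 160), ("c", 180), ("d", 290), ("e", 300)]

profile = propagation_profile(links, votes)
intercept, slope = fit_line(list(profile.values()))
```

In this toy example user "e" voted but is not connected to the source through other voters, so it is left out of the profile. On the real files the same idea would be applied story by story, with the straight line replaced by whatever model is being validated.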

Recent breakthroughs in computer science and computational statistics have produced efficient ways of dealing with data sets containing millions of entries. For example, with the use of the GPU (Graphics Processing Unit), matrix operations and calculations can be completed orders of magnitude faster than on a conventional CPU. That is, rather than performing a single calculation very quickly, as the CPU does, the GPU performs many computations simultaneously, each at a slower rate. Additionally, with the use of parallel programming, a computer can connect to a network of computers, which allows multiple processes to be run at the same time over many CPUs. Therefore, with the use of these specialized methods and others, big data with a large number of entries can be analyzed thoroughly and results can be obtained efficiently over short periods of time.

From Left to Right: Iryna Sivak, Mark Freeman, James McVittie

2013 Fields Undergraduate Summer Research Program

The Fields Institute is currently hosting the fourth Fields Undergraduate Summer Research Program. The visiting students are working on research projects with supervisors from the Institute’s principal sponsoring and affiliate universities:

Supervisor: Noriko Yui (Queen’s University)

Supervisor: Jianhong Wu (York University)

Supervisors: Ilijas Farah (York) and Bradd Hart (McMaster)

Supervisor: Matheus Grasselli (Fields Institute and McMaster University)

The students will spend the summer working on their projects in small groups, and will present their results at a mini-conference on August 21, 2013.

2012 Fields-Mitacs Undergraduate Summer Research Program

From July 3 to August 24, 2012, the Fields Institute hosted the third Fields-Mitacs Undergraduate Summer Research Program. On the first day of the program, the twenty students, visiting the Institute from across the globe, were introduced to the supervisors and learned about the potential research projects for the program:

Supervisors: Megumi Harada (McMaster) and Jessie Yang (Toronto and McMaster)

Supervisors: Nicholas Hoell (Toronto) and Adrian Nachman (Toronto)

Supervisors: Bei Chen (McMaster) and Matheus Grasselli (McMaster)

Supervisors: Ilijas Farah (York) and Bradd Hart (McMaster)

Over the summer, the students worked on their projects in groups with their supervisors, gaining insight and experience in the world of mathematics research. They presented the results of their work at a mini-conference on August 22, 2012.

Interview: Vishal

Zero was invented by Indian mathematicians. Vishal discovered this fact in his youth and it became the driving force for his mathematical journey. It also provided him a way to explore his national pride. Vishal came to Canada from Trinidad two years ago and although he grew up in the Caribbean, he embraces his Indian ethnicity and his Hindu religion.

Prior to the program Vishal learned many software languages on his own; LaTeX and Maple are just a couple. He is constantly searching for ways to apply mathematical theory. He believes communication to be an essential ingredient for mathematicians, but notes that, “it depends on the problem whether it is better to solve it on your own or in a group”.

Glaucoma patients suffer from increased pressure in the eye, which can cause complete blindness. Patients cannot be cured and need regular medication. Vishal enjoys researching the bio-mechanics of glaucoma, mainly because of the utility of models. In particular, the group is exploring the behaviour of the eye fluid, its pressure, and its movements using techniques from fluid mechanics. One implication of the model’s solution is the determination of the underlying causes of glaucoma. This fundamental understanding will eventually lead to better medications and improved methods of treatment.

“Math is the base of all sciences”, says Vishal. This quote illustrates his perspective as both respectful and philosophical.

Interview conducted by Mariya Boyko

Interview: Anna, Ferenc, and Zoltán

Hungary has a strong culture of mathematics. It also has a rich history of mathematicians including figures such as Lovász László and Erdős Pál. Interestingly, mathematics is popular in Hungary and many opportunities exist for mathematics graduates. We have been given a generous view into Hungarian mathematical culture by Anna, Ferenc, and Zoltán.

Anna and some friends recently created a new competition for students in grades 6 to 12. They did this after noticing that some children, mostly from a specific region of the country, were not being exposed to exciting math. “The competition is fun and not as serious as an Olympiad… If children can be reached when they are young they will like math more”, said Anna. Various other math competitions and camps are organized around Hungary, but cities have more mathematical activity for children. While in high school, Zoltán won many competitions. Ferenc is from a small town close to Szeged, and children in such towns do not always have access to Hungary’s major competitions.

In terms of group research style, Zoltán and Ferenc take the position that independent thinking is of critical value in collaboration. In contrast, Anna places higher value on planning and communication. They agree that unhealthy competition ruins professional relationships and can be counterproductive.

The three Hungarians want to teach in the future and are always looking for ways to meaningfully improve the Hungarian education system.

Interview conducted by Mariya Boyko

Interview: Louis-Philippe and Nigel

Louis-Philippe and Nigel are part of the Model Theory of Operators group supervised by Professors Bradd Hart and Ilijas Farah. Nigel and Louis-Philippe have established an excellent professional relationship. Intellectually they are well rounded, with a sharp sense of humour and great ambitions for their future academic careers.

Nigel and Louis-Philippe were pleasantly surprised by the daily routine that was established amongst the Fields-MITACS students. Upon arrival Louis-Philippe assumed that the students would have to be ‘focused on math only.’ As several weeks passed his assumption was proven wrong. The Fields-MITACS group lives a very active and balanced lifestyle. Nigel appreciates the flexible schedule and believes that, “the environment impacts the way we think.” Taking regular breaks filled with physical activity and friendly interactions increases Nigel’s and Louis-Philippe’s research productivity.

Louis-Philippe and Nigel, behind the Fields Institute

Louis-Philippe recently graduated from the University of Montreal. His long-term goal is to become a professor of mathematics. For now, he is striving to finish a one-year master’s degree at the University of Toronto. Louis-Philippe is aware of the challenges that might arise in the path of a young mathematician, but he is not willing to give up his dream.

Nigel is going into the third year of a Mathematics Program at McMaster University. His interests span multiple academic disciplines. Nigel enjoys discovering mathematics through other disciplines such as medicine and logic. With such wide-ranging interests it is hard for him to pinpoint the field he would like to enter after graduating.

Nigel and Louis-Philippe have never been a part of a group research project before, but they quickly discovered that the key to productive collaboration is the ability to take criticism and to be attentive to each other’s ideas. “We wouldn’t get very far without [accepting] criticism”, said Nigel.

Interview conducted by Mariya Boyko

Interview: Luke

The oldest of ten children in the family, Luke was born in a small rural community in Alberta. He was home schooled by his mother and grandfather, then took eight correspondence courses at Athabasca University and transferred to the University of Calgary. Luke is going into his fourth year.

He has refined his views on the impact of home schooling on social development, the essential and necessary conditions for optimal collaboration, as well as a theory to explain the shortage of female mathematicians in academia.

It is not common to find a home schooled student in the GTA, but the practice is widely used in rural areas. It is often argued that home schooled children are less social than their peers from public schools. Luke stated that he has never seen any evidence to back up this stereotype. He had a wide range of friends, both home schooled and public schooled. There were very social children and very shy children in both groups. Moreover, it was impossible to say that one group had more non-social children than the other.

Even though Fields-MITACS is the first group research project for Luke, he is well prepared for this research style and the possible difficulties that can arise in collaborative work. He has observed group research projects at the Institute for Quantum Information Science, where researchers needed to collect data and to analyze whether it could be representative of a real-life situation. Luke learned that common scientific interests are necessary for a successful collaboration but some degree of personal compatibility is needed as well.

Luke realizes that males compose the majority of mathematicians. Luke’s theory is that young girls sustain the stereotype that boys are good at math and girls are good at creative writing and reading. His sister is interested in the sciences but none of her friends are. As a result, she avoids having meaningful conversations with them because she is scared to be labeled as a nerd. Over time his sister felt discouraged and spent much less time pursuing her scientific interests.

Interview conducted by Mariya Boyko

Interview: Maximilian

Max is at the dawn of his mathematical career. He has already interacted with many professors, displaying optimism and an enthusiastic personality. Max is contemplating the possibility of staying in academia and teaching at the university level. His views on teaching are fully formed. In particular, the teaching styles he enjoys most are those which exude a high subject proficiency. His favourite professors focus on conceptual knowledge and rigorous concepts.

He has just finished his second year at the University of Toronto, so he certainly has a lot of time to find his academic passion. Max is determined to learn new skills. Long before the second semester was over he started looking for various summer projects that would provide him with valuable research experience. The MITACS summer project looked like a perfect opportunity.

Max appreciates the diversity of mathematical abstraction. When asked to name his favourite field of math, Max admitted that the abundance of “very interesting topics make it hard to choose”.

Max is a member of the Symmetries of Euclidean Tessellations and Their Covers project supervised by Isabel Hubard, Mark Mixer, Daniel Pellicer and Asia Ivic Weiss. There is a lot of independent work involved in this project. Max enjoys the ‘atmosphere’ of his office. It helps all the group members work productively. Max is studying the algebra of tessellations. Quotients of tessellations are another focus of his work. Such work demands a great deal of individual attention, but he emphasizes that collaboration among colleagues is essential to be able to consider different views on the problem.

Interview conducted by Mariya Boyko

Interview: Nikita

Nikita Reymer is a member of the Mathematical Finance group. Nikita was born in Russia, but also lived in Ukraine. He came to Canada when he was in grade seven. Recently he obtained a specialist honours bachelor’s degree in actuarial science from the University of Toronto and is planning to go to graduate school.

Upon arrival in Toronto Nikita was amazed by the diversity of cultures that co-exist in the city. The impression was so strong that he still speaks of it with visible excitement. Nikita also noticed that school textbooks in Russia and Ukraine were very different from Canadian textbooks. “There are no games in Russian textbooks”, said Nikita. In Canadian textbooks there are more illustrations, examples, and applications. In contrast, Russian textbooks present information in a laconic way with fewer illustrations, but with more rigor.

When asked to describe his current project, Nikita exclaimed: “This is the best project I ever participated in!” Students were given an opportunity to explore real data to predict financial crises. Supervisors meet with the students twice a week. They use a hands-on approach to the material, which gives students the freedom to explore topics that interest them. Work was divided equally among team members. Each student presented their assigned articles.

Prior to the MITACS program Nikita held an NSERC USRA in the summer of 2009. He worked on Group Representation Theory with Professor Repka.

Nikita thinks that collaboration is essential for mathematicians. More people can solve a problem faster. There is a saying that a lot of Russian people use: “having one head is good, but having two is much better”, meaning that it is more effective to involve several people in problem solving.

Interview conducted by Mariya Boyko


Interview: Fernando, Lucas, and Rafael

Fernando Lenarduzzi, Lucas Bentivenha, and Rafael Rocha are visiting from Brazil. All of them come from the Universidade Estadual Paulista “Julio de Mesquita Filho”.

(L to R) Lucas, Rafael, and Fernando on Spiral Staircase

Along with other student researchers, Fernando works with the Symmetries of Euclidean Tessellations and Their Covers group supervised by Isabel Hubard, Mark Mixer, Daniel Pellicer, and Asia Ivic Weiss.

Lucas and Rafael are members of the Mathematical Finance group supervised by Matheus Grasselli and Oleksandr Romanko.

Upon arrival in Toronto they were impressed by the politeness of the local drivers and the absence of traffic. Fernando said that “compared to Rio, Toronto does not have traffic problems at all.” Rafael was surprised to see that such a heavily urbanized city as Toronto is full of trees and parks. Lucas pointed out that Rio and São Paulo are very similar to Toronto.

Fernando’s description of the Brazilian education system explained the reasons behind their strong mathematical background. Brazilian students need to choose a specialised high school after grade nine. Their options include mathematics, arts, and military. These schools prepare the students for university by focusing on specific subjects.

In order to get into a university, one needs to write a standardized test. The students who succeed on the test have a chance to enter public universities. The students who do not do well can go to private universities where the quality of education is lower. Fernando, Lucas and Rafael all managed to get into public universities.

They mentioned that it is not uncommon for Brazilian students to use textbooks in foreign languages such as Spanish, French, or English. “One only needs to learn the basics of the language. Math is international”, said Lucas. Often multiple textbooks are used in a single course.

Fernando had a chance to work with a research group before, but for Lucas and Rafael this is their first group project. Both of them are amazed to see the math behind the financial crisis. “Even highly educated people sometimes say that financial crisis is unpredictable. It is amazing to learn that we can stay in control”, said Lucas.

Interview conducted by Mariya Boyko

