
Tuesday, April 7, 2015

(Re)solving big problems - one petabyte at a time!

recorded on April 7, 2015
Rise - with us - to the challenge with active hope and learn how big data might help us save our planet!


 (vTap beat and beeps)
REBEKAH:    Big dreams?
CHRISTINE: Big data!
REBEKAH:    I'm Rebekah Nix;
CHRISTINE: I'm Christine Maxwell;
BOTH:           … and together we are vTapestry.
INTRODUCTION
REBEKAH:    Happy Earth Day, Christine! It is definitely springtime in Texas again; it’s great to see so many folks enjoying the outdoors already. Now that I teach completely online, I miss taking students around the UTD campus to practice field sampling… or digital curation with new apps, like iNaturalist. One of my favorite workshops was a day-long excursion to a local preserve where I set up comparisons between the traditional testing protocols and just-released mobile sensors and probes from PASCO Scientific. The Dissolved Oxygen test results were greatly improved in terms of speed and accuracy – which left more time for exploring.
After another drought-stricken summer, I’ve been thinking about how what you and I are exploring (from a technology standpoint) relates back to the real world in terms of the natural environment. I’d like to talk more about that today, with a focus on how new ways of thinking and doing can re-energize our generation and encourage our students to discover new pathways for our immediate present and uncertain future. As a science educator, I’ve always felt a strong connection to the earth and am glad that technology continues to be a valuable tool in understanding our universe. But I worry about how the next generation relates…
As you know, my professional pathway has always aligned closely with my personal story. Like you, I think we grew up during the best times... when our mothers could get away with locking us outside for a few hours each day - and they did just that! While we had 'new technologies' and found them interesting, they were still novelties/gadgets back in the 60s and 70s. With today's kids spending “on average of 44.5 hours per week in front of screens” – sometimes 2+ at a time – it all seems like one big global obsession now.
CHRISTINE: I know what you mean. In the last two decades, childhood has moved indoors. The average American boy or girl spends as few as 30 minutes in unstructured outdoor play each day, and more than seven hours each day in front of an electronic screen. This shift ‘inside’ profoundly impacts the wellness of our nation’s kids. Childhood obesity rates have more than doubled in the last 20 years; the United States has become the largest consumer of ADHD medications in the world; and pediatric prescriptions for antidepressants have risen precipitously. With respect to their health and development, most of our kids are missing out on an essential connection to the natural world.
As media multiplies, it's becoming much more difficult to manage kids' screen time. A few decades ago, television was the only tech distraction; now there are smartphones, tablets and laptops – not to mention electronic games. As I learned from a study published in the journal Computers in Human Behavior, children who spend so many hours with their eyes glued to screens are less able to recognize emotions. Sixth graders who went five days without exposure to technology were significantly better at reading human emotions than kids who had regular access to phones, television and computers.
Did you know that – according to the American Optometric Association – over 10 million children in the United States suffer from undetected vision problems? These often contribute to difficulties in the classroom, even in students who score 20/20 vision in a vision screening. While computer assisted training supports gains in rigorous vision therapy programs, Computer Vision Syndrome, also referred to as Digital Eye Strain, describes a group of eye and vision-related problems that result from prolonged computer, tablet, e-reader and cell phone use. Many individuals experience eye discomfort and vision problems that increase with the amount of digital screen use.
Big Data Solutions for a Cleaner – Greener – World
On a planetary scale however, the simulations and visualizations that scientific researchers are studying on those various screens hold amazing promise for global citizens to work together to solve – in many cases continually resolve – today's serious environmental issues. Big data is definitely making a big difference in that regard and can certainly improve our understanding of the overall situation.
REBEKAH:    That’s exactly what piqued my interest in this burgeoning field that merges science, technology, the arts, and most importantly, policy. As in education, I’m excited about how big data tools and techniques can be used to ‘objectify’ the ‘messy’ problems we’d all prefer to ignore, in this case, regarding what’s happening to our planet. An MSNBC clip of shocking images of California's landmarks before and after the drought began clearly shows that our habitat is changing. It can be quite depressing… but Joanna Macy and Chris Johnstone pegged my ‘Big Dream’ for realizing the potential of ‘Big Data’ in their 2012 book Active Hope: How to Face the Mess We're in without Going Crazy. I hadn’t ever thought about the fact that:
The word hope has two different meanings. The first involves hopefulness, where our preferred outcome seems reasonably likely to happen. If we require this kind of hope before we commit ourselves to an action, our response gets blocked in areas where we don’t rate our chances too high... The second meaning is about desire… knowing what we hope for and what we’d like, or love, to take place. It is what we do with this hope that really makes the difference. Passive hope is about waiting for external agencies to bring about what we desire. Active Hope is about becoming active participants in bringing about what we hope for. Active Hope is a practice… it is something we do rather than have. It is a process we can apply to any situation.
And that’s where big data really might just help us to save our planet! Here’s an example I think we can all appreciate. MIT researchers have shown smarter programming of stoplights can cut greenhouse emissions – while reducing the frustration of sitting in traffic during rush hour! Two papers published in Transportation Science and Transportation Research describe how they combined vehicle-level data with less precise — but more comprehensive — city-level data on traffic patterns to produce better information than current systems provide. While existing programs can simulate both city-scale and driver-scale traffic behavior, the challenge in this case was integrating the two. The MIT team figured out how to reduce the amount of detail to make the computations practical, while still retaining enough specifics to make useful predictions and recommendations. I look forward to learning the results of their next trials to test the potential of the system for large-scale signal control. Hopefully Dallas and Denver are on their list!
This success touches on some of the complexities of leveraging the masses of data we’ve already collected though, doesn’t it Christine? A think tank at Purdue’s Discovery Park purports that “Big Data is enabling the next generation of interactive data analysis with real-time answers. (And that) In the near future, queries will be automatically generated for content creation on websites, to populate hot-lists or recommendations, and to provide an ad hoc analysis of the value of a data set to decide whether to store or to discard it. Scaling complex query processing techniques to terabytes while enabling interactive response times is a major open research problem today.”
CHRISTINE: Indeed. Advances in computing power coupled with the incorporation of natural language and semantic search capabilities have made it possible to move far beyond the simple Boolean keyword searches of early research. Today, Google uses predictive search capabilities and advanced functionalities that enable the answering of ‘conversational queries’. The crowning caveat, however, is the challenge of achieving scalability together with high speed and high accuracy, all at the same time. That is a very, very difficult goal to achieve. The Gartner Magic Quadrant that examines the competitive market for advanced analytics platforms shows a sparse field compared to its Magic Quadrant for Business Intelligence and analytics. And that’s primarily because only a very few companies have created a highly scalable environment along with all their other ‘bells and whistles’.
Simply put, the challenge is to reach all kinds of data types – in real time – in order to analyze all the relevant data found within a single, continuous view – at the same time. Analyzing it… that’s what big data is really all about. Unless you can connect what matters to you in ways that make sense to you, it is just more data. Machine learning (a form of Artificial Intelligence) allows applications to learn from what you ask about the data, often now in your own conversational questions. The good news is that there is, available today in fact, software that is able to handle these ubiquitous ‘big data’ challenges.
REBEKAH:    I’d like to bring up another variation on this theme. Following on with our attempt to define ‘big data’ (which just keeps getting bigger and bigger), I was intrigued by a presentation on “Big Data and the Future for Ecology” that referenced “challenges at scales individuals can’t address.” Long story short, the team made the point that, collectively, ecologists “may have ‘big data’ – [but they] just aren’t using it.” Their study explored how dark ecology’s ‘dark data’ really is today… in terms of observatories, remote sensing, and Citizen Science data, for example.
In a 2013 article of the same title, those authors note that “ecologists already have big data to bolster the scientific effort – a large volume of distributed, high-value information” but question whether or not ecologists will “join the larger scientific community in global initiatives to address major scientific and societal problems by bringing their distributed data to the table and harnessing its collective power.” In a nutshell, this is how they summarized their ‘call to action’ for traditional researchers: “Ecologists collectively produce large volumes of data through diverse individual projects but lack a culture of data curation and sharing, so that ecological data are missing from the landscape of data-intensive science…”, “To fully take advantage of scientific opportunities available in the information age, ecologists must treat data as an enduring product of research and not just as a precursor to publications”, and that “Forward-thinking ecologists will organize and archive data for posterity, publicly share their data, and participate in collaborations that address large-scale questions.”
I certainly appreciate Joanna Macy’s pointing out that “Trying out a different way of thinking about our situation (meaning the sustainability of Earth) is a powerful way of strengthening our resilience and creativity.” I am very enthusiastic about the huge take-up in “citizen science”. Citizen science refers to data collection and interpretation made by science enthusiasts rather than trained scientists – or alongside trained scientists. What draws people to participate is getting involved in the activities themselves, particularly when those activities are local projects where tangible impact is possible and benefits to the local community can be visibly seen or felt. Great examples to explore further are the long-running Global Learning and Observations to Benefit the Environment (GLOBE) program – a world-wide hands-on, primary and secondary school-based science and education program – and “zooniverse.org” – whose mission is to “make citizen science websites so that everyone can be part of the real research online.” As an example, if you are interested in the Search of Erupting Black Holes, why not help astronomers discover supermassive black holes observed by the very large array and compact array telescopes? Just go to the Zooniverse home page, pick “the Search for Black holes” and click on the bright yellow button that says “BEGIN HUNTING”!
Another example of Big Data is the Large Hadron Collider, at the European Organisation for Nuclear Research (CERN), which has 150 million sensors and created 22 petabytes of data in 2012 (1 petabyte = 10^15 bytes). In biomedicine the Human Genome Project is determining the sequences of the three billion chemical base pairs that make up human DNA. In Earth observation there are over 200 satellites in orbit continuously collecting data about the atmosphere and the land, ocean and ice surfaces of planet Earth with pixel sizes ranging from 50 cm to many tens of kilometers.
Reality Check…
REBEKAH:    Ah, I get it. That’s probably part of what Angel Hsu was talking about in an AAAS interview about Big Data and the environment. When asked in what ways she sees Big Data within that field changing or growing in the near future, she said:
I think that environmental policymakers and decision makers need to get more creative in thinking about how to generate big data for the environment. Trees can't tweet, and oceans can't create a census of species that live beneath their surface, so until then we've got to get creative about how we can generate the necessary information and knowledge to make the smart policies and decisions that aren't ad hoc. Even though we talk a lot about the need to bridge together disciplines and actors, it's still not being done well enough in environmental policy.
Big data clearly can contribute to our understanding of earth systems today. I like how Macy positions the individual in our current dilemma: “Everything we do has ripples of influence extending far beyond what we can see. When we face a problem, a single brain cell doesn’t come up with a solution, though it can participate in one. The process of thinking happens at a level higher than just individual brain cells – it happens through them. Similarly, there’s no way that we personally can fix the mess our world is in, but the process of healing and recovery at a planetary level can happen through each of us becoming aware and making micro-decisions to help save our planet for future generations.”
The value of technology in addressing the environmental question really boils down to a matter of time. That's the urgency... As Macy frames it: “In agricultural societies, the year’s rhythm is counted in seasons. In the days before clocks, the sun moving across the sky gave shape to the day. Compare these natural cycles with the time intervals of modern technology, now measured in fractions of a microsecond. Life has become a race in a way that is historically unprecedented.”
Making a Big Difference
CHRISTINE: As Hsu argues, “we are still in the very early phases of developing large-scale datasets for the environment. There are some sources of big data generation for the environment. Satellites, for example, have been around for nearly half a century and are being used for all types of observation, ranging from assessing urban growth to sea level rise and deforestation. However, even these sources of 'big data' are limited in some way in their ability to answer certain environmental questions. Satellite analysis can tell, for example, how much a city has grown over a certain time period, but it doesn't tell me how much energy or water was used in that expansion.”
REBEKAH:    Right. The reason I’m so taken with The TerraMar Project is because Ghislaine Maxwell’s passion, combined with her expertise, literally has built a truly innovative technology framework that provides the latest and greatest information that can lead to meaningful insights and important action that drives policy decisions, say, by the United Nations. I’m proud that my own Teacher Development Center, through the School of Interdisciplinary Studies, was able to share her amazing story in an archived public seminar at UTD. Thirteen minutes into that presentation, she stated that, “Part of the problem here is that we don’t have enough data. We don’t have enough information on the High Seas. We don’t have enough information on the ocean…” Then she previewed the first High Seas weather report! At sixteen minutes, she said it again: “How can we make good decisions if we don’t have a lot of information? Knowledge is power. Data is key. So we need more information.” I agree with Ghislaine and accept that the time is now for seriously big data!
CHRISTINE: Rob Foos, Director of Development for The TerraMar Project, effectively explained how a sense of urgency can help us solve environmental issues. The way he puts it is this:
Creating a sense of urgency, whether artificial or actual, is a powerful force to further any cause. I think it results from a mixture of Parkinson’s Law – work expands so as to fill the time available for its completion, the procrastinator’s favorite (wait until the last minute and it will take only a minute to do) – and FOMO, the recently popularized and surprisingly poignant slogan from Verizon’s NFL commercials – Fear of Missing Out. When provided an opportunity to take some action, whether it’s to donate, sign a pledge, buy a product – you name it – without an impending deadline, many will take the time to further evaluate the chance... which means they’ll likely not return. Introduce a sense of urgency and the concoction resulting from Parkinson’s Law and FOMO produces surprising results. The ability to delay the decision doesn’t exist, so the people who weren’t going to take any action in the first place are going to move on, but those that may take the action are now forced – whether artificially or actually – to make that decision on the spot, no more procrastinating, and now conversions improve. This is played out every day right before our very eyes. Think of any of the millions of commercials saying, “While supplies last!”, “This weekend only!”, or any number of other phrases. The supposed discount or coupon isn’t what’s getting you in the door – it’s the fear of missing out on the ability to use it.
How does this apply to the environment, and specifically, the ocean? An actual deadline is approaching, no artificial sense of urgency here. The United Nations is deliberating their sustainability priorities for the next 15 years called the Sustainable Development Goals. Replacing the successful implementation of the Millennium Development Goals, which Bill Gates lauded as the best idea for focusing the world on global poverty that he’s ever seen. This opportunity comes at a time when there’s a rare convergence of technology, environmental need, and political will – the time has never been better for meaningful change for the ocean and our planet. If we don’t take action – if we don’t make the ocean a priority now, it will be too late in 15 years. 
This very real sense of urgency is why The TerraMar Project is gaining momentum. Our mission is to build a global community to give a voice to the most ignored, least explored part of the planet – the ocean, and specifically the high seas. We are diligently advocating for the ocean’s inclusion in the United Nations Sustainable Development Goals because, at the end of the day, if the ecosystem that comprises 71% of the planet (the ocean), and specifically that part of the ocean called the high seas and international waters which are ruled by no nation but owned by all – 45% of the planet – if the ocean isn’t included in these goals, the damage done in the next 15 years may very well be irreversible or come at so significant a price that it may be politically impossible. This is the only Earth we’ve got, there’s no Planet B…
REBEKAH:    Again I come back to John Muir’s statement that closes my PhD dissertation: “When we try to pick out anything by itself, we find it hitched to everything else in the universe.” The Connected Learning movement in K-12 education may open the door for today’s students to solve these new problems in new ways. It “advocates for broadened access to learning that is socially embedded, interest-driven, and oriented toward educational, economic, or political opportunity. Connected learning is realized when a young person is able to pursue a personal interest or passion with the support of friends and caring adults, and is in turn able to link this learning and interest to academic achievement, career success or civic engagement. This model is based on evidence that the most resilient, adaptive, and effective learning involves individual interest as well as social support to overcome adversity and provide recognition.”  
Closing
A few years ago I challenged my students to ‘teach paperless’ with me. Continued advances in cloud computing have allowed me to ‘go paperless’ in almost every area of my daily life. This year, I’m actively challenging everyone I know to take the free and easy “I Love the Ocean” pledge at The TerraMar Project website (http://theterramarproject.org). Through the efforts of The TerraMar Project, one can truly amplify the voice of the ocean by BEING the voice of the ocean – with just a click! – making a difference through passion, creativity and harnessing knowledge that can help to stall climate change on a global scale. It is clear that each of us can take responsibility to do our own thing to help because it all adds up…
CHRISTINE: Joanna Macy also puts the possibilities of ‘big data’ into a realistic (IMHO) perspective:
Imagining possible futures is a surefire way to develop foresight. If we’re only interested in “facts,” we limit ourselves to looking at what has already happened, which is a bit like trying to drive a car by looking only in the rearview mirror. To avoid crashing, we need to look where we’re going. Since we can’t know for sure what will happen, we are limited to considering possibilities, based on applying a combination of experience, awareness of trends, and imagination. While experience equips us well for dealing with familiar situations, our imagination is essential in formulating creative responses to new and as yet unimagined challenges.
Dr. Nix and I implore those who listen to this podcast now to think more creatively, constructively and critically than ever about how today's big data tools and analytical techniques can be leveraged to improve the quality of life for all – including planet Earth – from whatever topic fuels your passion.
REBEKAH:    Thanks for listening today.
CHRISTINE: You can find out more at vTapestry.com.

BOTH:           Bye, for now! (vTap beat and beeps)