Friday, November 7, 2014

Why Big Dreamers Need to Know About Big Data

recorded on November 7, 2014
Meet your Big Dreams? Big Data! co-hosts and learn what ‘big data’ means to them
– and, more importantly, why it matters to you!

(vTap beat and beeps)
REBEKAH:    Big dreams?
CHRISTINE: Big data!
REBEKAH:    I'm Rebekah Nix;
CHRISTINE: I'm Christine Maxwell;
BOTH:           … and together we are vTapestry.

INTRODUCTION

REBEKAH:    Welcome to our creative disturbance channel. From pixels to waves to real-time streams, artists and scientists working on big ideas need to be able to work with massive, highly varied, digital datasets –– without compromising their projects due to limited network capabilities. This channel will encourage ‘early adopters’ to extend human networks to push the edge of the latest technological breakthroughs even further.
CHRISTINE: We'll explore keywords and concepts like the semantic and synaptic webs, predictive analytics, virtual consolidation, Internet Protocol version 6, the Internet of Things, iterative discovery, digital integrity, 3-D visualization, petabyte club, gigapop pipe, and much more from a practical view. Our goal is to 'translate' the techno-jargon of the engineers and computer scientists (who create the tools and technologies) into a 'story' or 'case' for the ultimate end-users and information consumers – people like us!

Who We Are

REBEKAH:    That reminds me of another new term that I think describes our goal with this channel. Have you heard of ‘antidisciplinary’ work? It was listed as a job requirement for a faculty position in the MIT Media Lab! Director Joi Ito explained it this way:
"Interdisciplinary work is when people from different disciplines work together. An antidisciplinary project isn’t a sum of a bunch of disciplines but something entirely new – the word defies easy definition. But what it means to me is someone or something that doesn’t fit within [a] traditional academic discipline – a field of study with its own particular words, frameworks, and methods." (SOURCE = http://thegovlab.org/antidisciplinary/)
To me, that's a good definition for a ‘creative disturbance’!
CHRISTINE: Actually, that's why it’s no surprise that you and I met through the School of Interdisciplinary Studies at UT Dallas! As you know, I started out with a position relating to eLearning and searching out innovative online learning efforts across campus. Now focused on learning technologies, I am working to enhance our physical infrastructure so it can support the ‘Big Dreams’ of faculty, staff, and students.
In the early 80s, I was very involved with online information access and retrieval, and I acquired a small information broker company based in Berkeley, California. There, I was introduced to the magic and limitations of Boolean logic for online research. Roger Summit had created a company called Dialog, which, in many ways, was the ‘Google’ of the late 70s and early 80s. Dialog’s unique contribution to information access was offering massive, consistently formatted databases and a (Boolean) search language that provided flexible, yet precise, search capabilities. That actually allowed cross-database searching for the very first time.
And, as they say, ‘the rest is history’. It’s an exciting time to be at this intersection of the arts, sciences, and technology.
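As an illustration of the Boolean, cross-database searching Christine describes, here is a minimal sketch in Python. It is not Dialog's actual query language: the two toy databases, their records, and the `all_of` / `any_of` / `none_of` parameter names are all hypothetical, chosen only to show how AND, OR, and NOT can be applied to consistently formatted records from more than one source.

```python
import re

# Hypothetical toy "databases"; every record id and text below is invented.
DATABASES = {
    "news": {
        1: "satellite imagery maps water use",
        2: "art history and migration",
    },
    "patents": {
        7: "irrigation control using satellite data",
        8: "video conferencing hardware",
    },
}


def terms(text):
    """Split a record into a set of lowercase words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(record_terms, all_of=(), any_of=(), none_of=()):
    """Boolean test: every all_of term (AND), at least one any_of term (OR,
    skipped if empty), and no none_of term (NOT)."""
    return (
        all(t in record_terms for t in all_of)
        and (not any_of or any(t in record_terms for t in any_of))
        and not any(t in record_terms for t in none_of)
    )


def search(databases, **query):
    """Run the same Boolean query against every database and merge the hits."""
    hits = []
    for name, records in databases.items():
        for rec_id, text in records.items():
            if matches(terms(text), **query):
                hits.append((name, rec_id))
    return sorted(hits)


# satellite AND (water OR irrigation) NOT video
results = search(
    DATABASES,
    all_of=["satellite"],
    any_of=["water", "irrigation"],
    none_of=["video"],
)
```

The point of the sketch is that one query can run unchanged across several sources only because every record shares the same format, which is the property Christine credits to Dialog's databases.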
REBEKAH:    We are definitely living ‘in interesting times’ and our interests do overlap in amazing ways.
My formal training in Geosciences led me to take map-making to the ends of the earth, literally, with satellite imagery and remote sensing. I have always loved the colors and the shapes of nature, as well as the opportunities for photography that being out in the field afforded. My knack for manipulating computers (really a lack of fear of those dreaded early machines) landed me in high-tech start-ups, pioneering videoconferencing and large-scale government applications. As a technical writer who loves to try new things, I became the liaison between the hardware and software engineers, and then on to explain all of that side to the marketing and sales teams. After surviving the dot.com boom, I decided to bring that expertise back into my favored field of science education by creating technology-enriched learning environments.
So, I very much enjoy our varied conversations about how we can help make a positive difference through UT Dallas connections! We’ve already had the privilege of working together on various projects for several years now, and we're both rather creative individuals who are disturbed by some of the things that are going on around us with regard to the advent of this thing called 'big data'. It's not just about having a lot of data, but also involves the handling of different types of data in new ways, right, Christine?

What Exactly Is Big Data?

CHRISTINE: Absolutely. The thing I find disconcerting is that the notion of ‘big’ is not easy to pin down in any context. In fact, there is no agreement on the definition of what constitutes ‘Big Data’. In my mind, however, it is clear that Big Data isn’t just about size. Research shows that the variety of data is by far the greatest challenge to solve. The National Institute of Standards and Technology (NIST) argues that Big Data is data which exceeds the capacity or capability of current or conventional methods and systems. But there are other organizations that point out that large datasets are not always complex, and small datasets are not always simple. Microsoft, for example, provides a quite specific definition, stating that “Big Data is the term increasingly used to describe the process of applying serious computing power, the latest in machine learning and artificial intelligence, to seriously massive and often highly complex information.”
(SOURCE = http://www.adambarker.org/papers/bigdata_definition.pdf)
I think what’s important here, in defining ‘big data’, is the concept of there being a set of related technologies and terms which should and do form critical parts of the definition. What all these elements point to, as I see it in action, is that big data is intrinsically related to data analytics and the discovery of meaning from data.
REBEKAH:    I remember how you introduced the vastness of ‘big data’ to me. You pointed out that usually when you talk to people about “big data”, a very analytic type of data (typically numbers) pops into their head first. This is in fact where big data got its start. It was all about numeric information. It was transactional, machine-generated information that was manipulated to find correlations and trends that were meaningful to highly-skilled data analysts.
But the fact is that about 80% of all data available falls into a whole different realm... you called it ‘everything else data’. That’s not the numeric information. It’s not the kind of thing that you can put into a spreadsheet and run an algorithm on. It’s information in the form of written content such as emails. It’s travel information. It’s Facebook and Twitter postings. It’s all unstructured content – like what we view every day on the Web.
CHRISTINE: Yeah, that’s very well said and I think what’s also interesting about this ‘everything else data’ is that 10 years ago we didn’t know it would exist to be collected because apps like Facebook didn’t exist. We didn’t have smart phones or Twitter either! All of a sudden these new information sources that we didn’t anticipate having access to, are not only popping up, but are downgrading the level of importance of the old content delivery channels forever – and at phenomenal rates.
The challenge now is how to deal with both the analytic, numeric side AND all this other unstructured, often visual, data that is also highly distributed. That’s what we’ll explore on this creative disturbance channel.

Who Are Some Successful Big Dreamers?

REBEKAH:    Being one of those big picture thinkers, I've always been frustrated by the limitations, the constrictions, of traditional research, but now I'm absolutely awed by the ability to visualize and manipulate big data in ways that have never really been imagined before. My broad background has helped me keep an open mind – not just open, but an interested mind – so the curiosity of artists, and what 'normal' people call idiosyncrasies, is what I find exciting about this convergence of technology and science and the arts and human creativity. Machine learning is another tool that we can use to our advantage if we stretch our minds and open our thoughts to new ways of working and playing and making new views of new things. ‘Creative disturbance’ is a fantastic name for what's happening right now across all of these areas.
CHRISTINE: I think giving a few examples will help to whet our listeners’ appetites.
Dr Maximilian Schich is an art historian in ATEC at UT Dallas. His research focuses on new ways of understanding historical data through the convergence of physics, computer science, information visualization and hermeneutics. The key to his recent work is the ability to gain access to and manipulate big data in new and creative ways. He has managed to create and quantify a big picture of European and North American cultural history through a reconstruction of the migration and mobility patterns of more than 150,000 notable individuals over a time span of 2,000 years. By connecting the birth and death locations of each individual, Schich and his team have made progress in our understanding of large-scale cultural dynamics. (See more at http://www.sciencemag.org/content/345/6196/558)
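To make the idea of "connecting the birth and death locations of each individual" concrete, here is a minimal sketch in Python. It is not the method or data used by Schich and his team: the four records are invented placeholders, and the helper names `migration_edges` and `net_inflow` are hypothetical. It only shows how birth-to-death pairs can be counted as directed links between places.

```python
from collections import Counter

# Hypothetical records: (person id, birth place, death place). All invented.
people = [
    ("person_a", "Florence", "Rome"),
    ("person_b", "Florence", "Rome"),
    ("person_c", "Vienna", "Paris"),
    ("person_d", "Paris", "Paris"),
]


def migration_edges(records):
    """Count directed birth -> death moves, skipping people who died where
    they were born."""
    return Counter((birth, death) for _, birth, death in records if birth != death)


def net_inflow(edges):
    """For each place, arrivals (deaths of people born elsewhere) minus
    departures (births of people who died elsewhere)."""
    flow = Counter()
    for (src, dst), n in edges.items():
        flow[dst] += n
        flow[src] -= n
    return dict(flow)


edges = migration_edges(people)
inflow = net_inflow(edges)
```

With records for many thousands of individuals, counts like these become a network of places whose links can be mapped or compared across centuries; the sketch stops well short of that analysis.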
Dr David Lary is a physicist working with big data [in EPPS] at UT Dallas. One area of his research involves combining plant biology with massive data from Earth-orbiting satellites to help inform the conversation about a critical economic and societal issue: water resources management. Texas’ population, for example, is expected to double within thirty years. So without much more intelligent use of existing infrastructure, the chances of there being enough water to support such population growth are really slim. Luckily there is an existing infrastructure that can enable state-wide efficiencies to be made in irrigation control. And here is where Big Data Analytics and Machine Learning enter.
Dr Lary uses existing NASA satellite data to track water usage by measuring how much light is reflected from the Earth’s surface in each of three wavelengths of light. This allows measurement of the amount of chlorophyll absorption – and by inference – where plants are not doing so well for lack of water. In the near future, municipalities using such techniques could deploy a weekly remote sensing inspection to help identify areas of overwatering or burst pipes, as well as to optimize irrigation patterns and automated sprinkler systems. Dr Lary has developed an innovative approach to bringing about significantly greater water use efficiency – and Texas municipalities are already listening. (See more at http://www.utdallas.edu/news/2014/3/26-29251_Professor-Uses-Satellite-Data-Plant-Biology-to-Tra_story-wide.html)
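The general idea of inferring plant health from reflected light can be sketched with a widely used vegetation index. This is an assumption for illustration, not Dr Lary's method: the transcript mentions three wavelengths, whereas the Normalized Difference Vegetation Index (NDVI) below uses only two bands, red and near-infrared. The pixel names, reflectance values, and the 0.3 threshold are hypothetical.

```python
def ndvi(red, nir):
    """NDVI for one pixel from red and near-infrared reflectance (0..1).
    Healthy vegetation absorbs red light (chlorophyll) and reflects
    near-infrared, so its NDVI is close to 1."""
    total = nir + red
    if total == 0:
        return 0.0  # no reflected signal in either band; avoid dividing by zero
    return (nir - red) / total


def flag_low_vegetation(pixels, threshold=0.3):
    """Return ids of pixels whose NDVI falls below a (hypothetical) threshold."""
    return [pid for pid, (red, nir) in pixels.items() if ndvi(red, nir) < threshold]


# Hypothetical (red, near-infrared) reflectance values per area.
pixels = {
    "field_1": (0.05, 0.50),
    "field_2": (0.20, 0.30),
    "lot_3": (0.25, 0.25),
}
flagged = flag_low_vegetation(pixels)
```

A low index alone does not prove water stress, since bare soil or pavement also scores low; a real workflow would compare the same area over time and against local knowledge before acting on it.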
REBEKAH:    And then there’s the dynamic duo of Dr Ellen Wagner and Beth Davis [of Sage Road Solutions]. With impressive help from the Gates Foundation starting in 2011, they are the guiding lights behind the creation of the PAR Framework. The Predictive Analytics Reporting Framework is a multi-institutional data mining collaborative that brings together 2-year, 4-year, public, proprietary, traditional, and progressive institutions to collaborate on identifying points of student loss and to find effective practices that improve student retention in US higher education. Their successful Proof of Concept pilot program demonstrated that applying predictive analytical methods to multi-institutional, federated data sets helps identify common, as well as institutional, points of risk and points of opportunity for proactively improving student success – and they’re continuously discovering new ways of looking at ‘big data’ today. (See more at http://www.parframework.org/about-par/overview/)

Why Co-Hosts? or How We Might Help You Connect Solutions!

Designing and delivering – and assessing and evaluating – my first online course was quite a challenge back in the 90s. Now we're starting to earn digital badges – or alternative credentials – that we choose ourselves for our own unique needs. Most of the world is awakening to the fact that technology has changed not just the way we play but the way we work... especially since we have the opportunity to learn from each other; that's where our teaming is important and makes a difference. That's why it takes both of us to do the work that we're doing these days – not that we're not capable, but because it's so much richer with these merged perspectives, these unique takes on things, and the complementary strengths. That's what we will be bringing to these discussions on a regular basis.
CHRISTINE: And finally, what I want to add is that we want to share the thoughts and ideas that arise from weaving together what we get from the members of our individual personal and professional networks, to inspire other thinking and collaborating. It's almost impossible to separate work from play or teaching from learning anymore. It's just how we are and I think that's what allows the creativity to be so much more evident and urgent. By pairing our views we're able to critically discern and comment on things in a constructive way. So we'll be bringing a lot of different ideas from many people and placing them in the context of big data and how that in turn influences and impacts the sciences, the arts, and even technology.

CLOSING

REBEKAH:    We decided to co-host this site, because that's what we do. Isaac Asimov brilliantly explained why ‘two heads are better than one’ in some cases, like this one. In a published essay, called On Creativity, written in 1959 for a government research group, he noted:
“No two people exactly duplicate each other’s mental stores of items.
One person may know A and not B, another may know B and not A, and either knowing A and B, both may get the idea – though not necessarily at once or even soon.
Furthermore, the information may not [only] be of individual items A and B, but even of combinations such as A-B, which in themselves are not significant. However, if one person mentions the unusual combination of A-B and another unusual combination A-C, it may well be that the combination A-B-C, which neither has thought of separately, may yield an answer.” (SOURCE = http://www.technologyreview.com/view/531911/isaac-asimov-asks-how-do-people-get-new-ideas/)
CHRISTINE: Indeed, those places where our interests and experiences overlap are where we have really deep and meaningful conversations that inspire constructive and critical and creative thinking in each of our own works, so we do encourage each of you listening to reach out to others and to find those common spaces and explore the possibilities. Look for opportunities to leverage new technologies and especially big data, as it is going to change academia and social realms in ways that we can't yet foresee. That’s why we’ll be looking at and learning about this ‘big deal’ for quite a while.
We hope you’ll join us.
REBEKAH:    Thanks for listening today.
CHRISTINE: You can find out more at vTapestry.com.
BOTH:           Bye, for now!
(vTap beat and beeps)