Monthly Archives: April 2012

Sampling Methods: Harvard Institute of Politics Innovative Way of Eliciting The Relative Importance and How to Write Reports from Sample Surveys – Also Sample Survey Presentation Deck

To understand how the Harvard Institute of Politics runs its panel survey, using an innovative way of collecting relative-importance information, read below.

Though I started this note thinking it was for general reading, I think the details in the analysis plan, and how it was summarized across sampling method, population definition, and report writing, suggest that this is a must-read for academic training and for professional followers. Some of the methodological issues are nothing new, but the innovative way they extracted the relative importance of issues deserves careful attention.

Some important highlights, from an academic training point of view, for predictive analytics scientists:

Learn to write like this for your planned survey:
The relative comparison of issues is beautifully captured here:

Think like the following when supporting the analysis of results and writing the report:

Here is the deck I presented on 2/20/14

Week7_Survey Methodology and Evil Twin of Bias_V2

From Data Monster & Insight Monster

Could BIG Data Save These Huge Companies? Towards Trillion Dollars Destruction and Reconstruction – The case of Sony

In the earlier note we saw how the consumer reach and marketing model is affecting Best Buy, and how this could be a great opportunity for reform: by leveraging new trends in lifestyle, Best Buy could become the new leader in BIG data driven market reach.

For more caution and challenges, see Why Best Buy is Going out of Business…Gradually

Though I see it as an opportunity, the whole write-up can be taken as a caution to act on soon, rather than as gloom and doom. The article is brilliant, and my interest is in what prescriptive approach can help the company, given new consumer lifestyle and technology adoption trajectories and the tools and methods of consumer reach that leverage consumption behavior. From that perspective, I would say mobile and BIG data are the facilitators of a renewal, and Best Buy can certainly benefit immensely. Best Buy’s culture of treating consumers as a resource for understanding the market pulse is awkward at best, and destructive as pointed out by the authors. The traditional best practices of consumer engagement, communications, and support are still paramount for a company to be competitive.

The next company getting thrashed by its competitors’ innovations, and by its poor alignment with changes in consumer lifestyle and technology adoption, is one of my teenage idols: the consumer electronics company Sony Corporation.

See: Sayonara Sony: How Industrial, MBA-Style Leadership Killed a Once Great Company

BIG data is BIG Business – The first BIG data stock, Splunk Inc., doubled on the first day – Market Value 3 Billion Plus – What Is Next?

BIG data – is it the next BIG revolution after the industrial revolution? Or is it the next internet revolution?  Splunk Financials:


Read:  What Splunk IPO Says about BigData

BBC interview that includes Splunk VP.

Key points in the audio that are touched on:

  • For the first time, we will be able to mine non-transaction data, which makes up 80% of interactions, communications, and engagements and is broadly called non-standard data
  • Computing costs are coming down, and we are able to mine this non-standard data much more easily
  • Consumers might have reasons for concern because predictions are becoming more sensitive (more true positives), and we have to trust that corporations and governments will behave responsibly
  • How it will help consumers: on the corporate side, the fridge will order what you need and it will be delivered on time; on the public side, anonymized data could be analyzed for faster and better public solutions, such as developing medicines and securing personal freedom
  • Integration of web, smartphone, and analytics is key to this

So which companies could be the next stock market rock stars of BIG data?

  • Cloudera
  • Couchbase, one of a group of companies that includes 10gen
  • Hortonworks
  • Sumo Logic
  • Metamarkets
  • Infobright

Updated: 5/5/2012

Also watch this one, which is already profitable and has room to grow when compared with SPLK.

DWCH – Datawatch

    Analytics degree programs – After Brown University’s New Executive MS, now NYU Masters in Business Analytics

    More and more universities are starting Analytics degree programs. Not all programs are the same; in fact, each one seems to be pretty unique.

    New addition:  Chicago part time program for MS in Analytics. 

    The key difference I see here is that Chicago has a required course on “Research Design for Business Applications” as one of the 7 foundation courses.

    Marketing analytics, survey research design, and financial analytics are electives, for a total of 11 courses and a capstone project.


    Chicago has another school, IIT (Illinois Institute of Technology), offering an MS in Marketing Analytics and Communications, which it calls the MS-MAC program.  This is a regular day program.

    On the other hand, the University of Cincinnati’s Lindner College of Business offers an MS in Business Analytics, available as both a full-time and a part-time program.  Check it out here

    New Addition: September 6, 2012

    Michigan State University’s Broad College of Business, in association with the College of Engineering and the College of Natural Sciences, is offering a new MS in Business Analytics program.

    The program is designed to be completed in one year of on-campus study across three semesters.

    The courses listed on the site are:

    ” …
    Spring Semester
    ITM 818 Introduction to Business Analytics (3 cr)
    How digitized business processes and data analytics are essential to the performance and competitive advantage of a modern corporation. Different approaches for strategic data management and business analytics. Real-world cases of successes and failures with analytics-based business strategies.
    ITM 822 Project Management (3 cr)
    Management of information systems projects.  Modeling of business processes. Management of project scope, time, and costs. Planning and control of projects. Program and portfolio management. Consulting issues for effective project management.
    CSE 891 Computational Techniques for Large-Scale Data Analysis (3 cr)
    Emerging issues in big data (e.g., collection, warehousing, preprocessing and querying; mining, cluster analysis, association analytics; MapReduce, Hadoop; out-of-core, online, sampling-based, and approximate learning algorithms; model evaluation and applications, etc.).
    Recommended background: CSE 232 or permission of instructor.
    MGT 805 Communications Strategies for Analytics (3 cr)
    Development of managerial level business communication skills. Communication strategy development in oral and written form.
    Summer Semester
    STT 863 Applied Statistics Methods (3 cr)
    Application of regression models including simple and multiple regression, model diagnostics, model selection, one and two-way analysis of variance, mixed effects models, randomized block designs, and logistic regression.
    Recommended background: STT 442 or STT 862; MTH 415 or concurrently.
    MKT 829 Marketing Technology and Analytics (3 cr)
    The collection and analysis of information from the web, including web-based surveys, web analytics, online communities, blog scraping, and web spiders to support marketing planning and performance. Online
    STT 890 Statistical Problems (3 cr)
    Individualized study on selected problems.
    Or AEC 891 Topics In Agricultural Economics (3 cr)
    Or CSE 890 Independent Study (3 cr)
    Or any 890-891 independent study/topics course at MSU for analyzing big data problems
    Fall Semester
    CSE 881 Data Mining (3 cr)
    Techniques and algorithms for knowledge discovery in databases, from data preprocessing and transformation to model validation and post-processing.
    Recommended background: Programming skills in C, C++, Java, and Matlab. Basic knowledge in calculus, probability and statistics.
    MKT 865 Emerging Topics in Business (3 cr)
    Perspectives on new and emerging issues of business administration.  Topics vary including content resource management systems, data marts, software meeting commercial standards, and a firm project related to big data.
    ITM 888 Capstone: Business Analytics (3 cr)
    Practicum in the development and delivery of predictive data analysis for strategic decision making in organizations.  Application of the principles and tools of analytics to real-world problems in R&D, marketing, supply chain, accounting, finance and human resources management.  Development and presentation of analytical insights and recommendations.
    MGT 805 Ethics and Intellectual Property Issues (1.5 cr)
    Legal, ethical, and intellectual property issues related to big data analytics.
    GPA Requirements. Students must maintain a cumulative grade-point average of 3.0 or higher in all graduate courses.

    New addition:  Starting Summer 2013

    Oakland University, Michigan, USA – the main claim is that you can complete the program in less than a year, with half of it online and half on campus.

    Program Toolbox and Skills

    Unified Modeling Language Tools
    Microsoft Visual Studio 2010
    Microsoft SQL Server
    SQL Server Reporting Services
    SQL Server Analysis Services
    SQL Server Master Data Services
    Palisade DecisionTools Suite
    Arena Simulation Software
    Microsoft Excel + VBA
    Excel Solver & XLMiner
    JMP Statistical Discovery
    TreePlan Toolkit
    Microsoft Project

    Degree: Master of Science in IT Management – Business Analytics

    Earn a Master’s degree in as little as a year.
    A 1-Year,  Cutting Edge, Focused Program in Business Analytics

    • Half On-Line
    • Half @ Oakland University
    • All In-State Resident Tuition Rates

    More details available here

    Updated Q1 2012:

    Here is a list, in no particular order; the first part largely follows the KDnuggets notes in structure and content.

    However, I will be adding my review of the course contents.

    NY University, Stern School of Business: interestingly, it will be a strong program for executives from NY, the Northeast USA, and China

    Bentley University Business Analytics programs: MBA with Analytics focus, MS in Information Technology with Analytics Focus, PhD in Business with Analytics Concentration. Waltham, MA.

    Central Connecticut State University (CCSU), offering MS in Data Mining. New Britain, CT.
    CMU Program in Knowledge Discovery and Data Mining at CMU Center for Automated Learning and Discovery. Pittsburgh, PA.

    Drexel U. MS in Business Analytics, at LeBow College of Business.

    Penn State Data Mining Certificate program, Pennsylvania, PA.

    Saint Joseph’s University MS in Business Intelligence, for business professionals. Philadelphia, PA.

    Stevens Institute of Technology Master of Science – Business Intelligence & Analytics, Hoboken NJ

    Also read: Five things a modern predictive analyst need to be good at 

    College of Charleston Undergraduate Discovery Informatics Program, a new interdisciplinary program which integrates computer science and mathematics with specific application disciplines. Charleston, SC.

    George Mason U. Computational Statistics in the Data Sciences Program. Fairfax, Virginia.
    George Mason U. Graduate Certificate Program in Data Mining. Fairfax, Virginia.
    NC State Master of Science in Analytics, Raleigh, NC.

    Oklahoma State University offers 2 certificates in Business Analytics and Business Data Mining for training management or technical professionals in the field of analytics and data mining using SAS tools such as SAS EG, SAS EM and JMP. Offered online and on-campus. Stillwater, OK.
    U. of Central Florida (UCF), Dept. of Statistics, offering a Data Mining Certificate Program and Master’s degree in Data Mining. Orlando, FL.

    University of Houston – Clear Lake Financial Data Mining course, by Prof. Gary D.

    U. of Louisville Certificate in Data Mining, jointly offered by Computer Science and Math Depts for training professionals in the interdisciplinary field of data mining. Louisville, KY.

    U. of Tennessee Master in Business Analytics, Knoxville, TN.


    DePaul University Center for Data Mining and Predictive Analytics, offering MS in Predictive Analytics, Chicago, IL.

    Northwestern University Master of Science in Analytics, Evanston, IL.
    See also Northwestern U. School of Continuing Studies Master of Science in Predictive Analytics, online program.

    University of Denver MS in Business Intelligence at Daniels College of Business, Denver, CO, USA.

    UIUC (U. of Illinois at Urbana-Champaign) Data Sciences Summer Institute, Urbana-Champaign, IL.


    Stanford Center for Professional Education, offers Data Mining and Applications certificate program for managers and professionals. Stanford, CA.

    UCI M.S. Program in Knowledge Discovery in Data. Irvine, CA.

    UCSD Data Mining certificate program, San Diego, CA.

    U. of California Santa Cruz graduate level data mining course, offered for professionals and graduate students at two locations: UCSC Silicon Valley Center and Santa Cruz.

    For more details, including Canadian schools, see the source listing.
    Most of the information here comes from it, with the latest updates added.  However, I plan to review each program in detail.

    July 2012:  Today, I am reviewing Northwestern University’s fully online, globally accessible program.

    The key training modules needed are:

    CIS 317-DL Database Systems Design & Implementation
    CIS 435-DL Data Warehousing and Data Mining
    PREDICT 401-DL (Core Course): Introduction to Statistical Analysis
    PREDICT 402-DL (Core Course): Introduction to Predictive Analytics & Data Collection
    PREDICT 410-DL (Core Course): Predictive Modeling I
    PREDICT 411-DL (Core Course): Predictive Modeling II
    PREDICT 412-DL (Elective): Advanced Modeling Techniques
    PREDICT 450-DL (Elective): Marketing Analytics
    PREDICT 451-DL (Elective): Risk Analytics
    PREDICT 453-DL (Elective): Text Analytics
    PREDICT 475-DL (Core Course): Project Management
    LEADERS 481-DL (Core Course): Foundations of Leadership

    PREDICT 498-DL (Final Project)

    This is a global program, and anyone with the right background can benefit from it.

    The Northwestern University site reads:  “Students in the MSPA program will learn how to harness the power of new data sources to influence high-level decisions and, as leaders with analytics expertise, to implement more effective strategies and solutions. The MSPA program’s convenient online format provides comprehensive instruction in advanced analytics — quantitative analysis, database administration, predictive modeling and more.
    The program consists of 11 courses, each 10 weeks in length. The program can be completed in as little as six quarters (1.5 years), but most students graduate after two to three years of study. Students can start courses in any one of four terms during the year.
    Visit our admissions page for more information.
    Working professionals with a solid mathematical foundation and an interest in quantitative analysis are encouraged to apply to the MSPA program. Our analytics courses adopt a conceptual approach that makes them accessible to students from a wide variety of professional and educational backgrounds. The program also includes opportunities for practical application of conceptual learning through case studies in five specific industries. The program concludes with an individual capstone project or course that allows students to demonstrate their knowledge and personal interests.


    The MSPA curriculum provides students with exposure to enterprise-ready database programs such as Oracle and Postgres, and the top statistical modeling software including IBM SPSS, SAS, and R. Graduates can leverage this multilingual training when seeking careers across industries.
    The MSPA program differs from other analytics programs by grounding students not only in analytics and its practical application, but also by helping students develop the critical communication skills that will enable them to convey the value of data-based decision-making and the meaning of complex analytic results.
    The MSPA program also includes Northwestern University School of Continuing Studies’ 10-week leadership course designed to equip students with the leadership techniques and skills necessary to direct data-driven decision making processes. The course gives students exposure to theories and best practices in the field and better prepares graduates to enter leadership roles in a wide variety of organizations.”

    BIG data Analytics:

    While industry understands the importance of BIG data and the need for a graduate program concentrated in BIG data, there is no full-fledged BIG data MS program yet.

    However, there is an online program from Stanford, the Mining Massive Datasets Graduate Certificate.

    The following set of courses is part of that certificate:

    Mining Massive Data Sets Graduate Certificate

    For more information, visit the Mining Massive Datasets Graduate Certificate page.

    LSU MS in Analytics Program:  The complete material is from LSU School

    “The Master of Science in Analytics program has a practical orientation. The MSA degree is designed to prepare students to use data-driven methods to contribute to organizational effectiveness and guide decisions. The curriculum emphasizes the use of business analytics, business intelligence and information technology to solve problems, reduce costs, increase revenues, streamline processes, and improve decision-making. Students learn specialized skills and knowledge drawn from the fields of computer science, statistics, operations research, and quality management to achieve results through a mixture of classroom instruction, hands-on exercises, and team-based projects. Key topics include the structured query language (SQL), multivariate statistics, clustering, data-mining, design of experiments, optimization methods, and predictive modeling. Teamwork, written and oral communication, presentation skills and state-of-the-art visualization techniques are also stressed throughout the curriculum.

    Key Points:  
    • Intensive 10 month, 36 course hour program.
    • Begins in the Summer Intersession Semester, not the Fall Semester.
    • Integrated curriculum includes data management, statistics, operations research, and information technology. It is designed to produce a well-rounded business analytics & business intelligence professional ready to succeed in private industry as well as in government and non-profit organizations.
    • The program’s content includes a variety of specialized business areas such as financial analytics, web analytics, marketing analytics, supply chain analytics, and healthcare analytics.
    • The program has a practical orientation. It uses large real-world databases with hands-on exercises to prepare the students to be productive employees immediately as they transition from the classroom to the workforce.
    • Close supervision and mentoring by ISDS Department faculty.
    • A cohort learning model that emphasizes team-based learning.
    • The program will be housed in the newly constructed Business Complex.
    • Opportunities to earn professional certifications from SAS and other sources during the program.
    • Above average salaries in a growing professional career area.
    • Goal is placement at Fortune 1000 companies.

    Degree Requirements

    • 36 hours of graduate level course work with a 3.0 average or above
    • Successful completion of a major project and passing the defense of the student’s project
    A typical course list is shown here:

    Terms: Summer Intersession, Fall Semester, Spring Semester

    EXST 4025, ISDS 7990
    SAS Data Access & Management, Data Cleaning, SAS Programming
    Formulating problems, Scoping projects, Select, construct & clean data, Consulting skills
    Evaluating results, Communication & Teamwork, Reports & Technical Writing, Reviewing Projects

    Area Knowledge
    ISDS 7540, ISDS 7220
    Marketing Analytics, Web Analytics, Healthcare Analytics
    Supply Chain Analytics, Forecasting Analytics

    Advanced Analytics I
    EXST 7087, ISDS 7103
    Design of Experiments, Survival Analysis, Advanced SAS Programming
    Operations Research, Bayesian Models, Scorecards, Stochastic Optimization
    ISDS 7024

    Advanced Analytics II
    EXST 7142, ISDS 7070
    Exploratory Data Analysis and Visualization, Regression with Diagnostics, ANOVA, Tables, Statistical Programming, Presentation skills
    Data Mining, Classification Models, Cluster Analysis, Fraud Detection, Risk Assessment, GIS, Text Mining, Financial Analytics, Process Improvement

    Business Intelligence
    ISDS 7510, ISDS 7511
    Relational Databases & Data Warehouses, Data Querying & Reporting, Dimensions & Cubes
    Advanced Dimensions & Cubes, Dashboards, KPIs, Reporting, Business Analytics & Knowledge Management

    Admission Requirements

    Admission to the program is competitive.  Minimum requirements for admission into the program are listed below.
    • A GPA of 3.0 or higher.
    • GMAT or GRE scores that meet the Graduate School’s requirements (the ISDS Department accepts either GMAT or GRE scores).
    • International students must have TOEFL scores that meet the Graduate School’s requirements. Students who have graduated from a non-US school should check the LSU Graduate School’s Admission Requirements to determine if a TOEFL Exam is required.
    • International students must certify that funds are available to pay all costs while studying at LSU.
    In addition to the Graduate School’s requirements, the ISDS Department requires the following items.
    • Résumé
    • Statement of Purpose
    • Three letters of recommendation
    These materials must be submitted with your other application materials.  Please do not send application materials to the ISDS Department directly. Submit them through the online application system.

    Completion of a graduate degree, an undergraduate degree that is more relevant, a higher GPA, higher test scores, and more relevant experiences will improve your chances of being admitted.  Your chances of gaining admission are better if you have any of the following:

    • Work experience in business intelligence, business analytics, data mining, data warehousing, data management, programming, web development, web analytics, risk management and related fields is advantageous.
    • A degree in a field that is very relevant to the MSA degree.  Some examples include: business, industrial engineering, management science, operations research, statistics, or computer science.
    • A degree in a field that emphasizes mathematics, statistics, or data base management.  Some examples include: any engineering degree, computer science, management science, most of the other sciences, operations research, production/operations management, economics, statistics, mathematics, and industrial/organizational psychology.

    An Incredible offer from O’Reilly Books and Videos – STRATA 2012 Videos – $299 with Possible Discount

    Strata Conference 2012: Day 1
    SQL and NoSQL Are Two Sides Of The Same Coin (40 minutes)
    From Knowing “What” To Understanding “Why” (42 minutes)
    The Model and the Train Wreck: A Training Data How-to (25 minutes)
    Corpus Bootstrapping with NLTK (15 minutes)
    The Importance of Importance: An Introduction to Feature Selection (28 minutes)
    Social Network Analysis Isn’t Just For People (40 minutes)
    Array Theory vs. Set Theory in Managing Data (45 minutes)
    Survival Analysis for Cache Time-to-Live Optimization (28 minutes)
    The Data Science Debate (Free Preview) (58 minutes)
    Introduction to Apache Hadoop Part 1 (55 minutes)
    Introduction to Apache Hadoop Part 2 (38 minutes)
    Introduction to Apache Hadoop Part 3 (30 minutes)
    Introduction to Apache Hadoop Part 4 (33 minutes)
    The Two Most Important Algorithms in Predictive Modeling Today Part 1 (34 minutes)
    The Two Most Important Algorithms in Predictive Modeling Today Part 2 (37 minutes)
    The Two Most Important Algorithms in Predictive Modeling Today Part 3 (54 minutes)
    The Two Most Important Algorithms in Predictive Modeling Today Part 4 (58 minutes)
    Large scale web mining Part 1 (45 minutes)
    Large scale web mining Part 2 (44 minutes)
    Large scale web mining Part 3 (40 minutes)
    The Craft of Data Journalism Part 1 (37 minutes)
    The Craft of Data Journalism Part 2 (45 minutes)
    The Craft of Data Journalism Part 3 (38 minutes)
    The Craft of Data Journalism Part 4 (47 minutes)
    Big Data Without the Heavy Lifting Part 1 (42 minutes)
    Big Data Without the Heavy Lifting Part 2 (31 minutes)
    Big Data Without the Heavy Lifting Part 3 (38 minutes)
    Big Data Without the Heavy Lifting Part 4 (33 minutes)
    Big Data Entity Extraction With Less Work and Less Code Part 1 (42 minutes)
    Big Data Entity Extraction With Less Work and Less Code Part 2 (33 minutes)
    Big Data Entity Extraction With Less Work and Less Code Part 3 (47 minutes)
    Big Data Entity Extraction With Less Work and Less Code Part 4 (32 minutes)
    Introduction to R for Data Mining Part 1 (48 minutes)
    Introduction to R for Data Mining Part 2 (43 minutes)
    Introduction to R for Data Mining Part 3 (47 minutes)
    Introduction to R for Data Mining Part 4 (53 minutes)
    Building Applications with Apache Cassandra Part 1 (19 minutes)
    Building Applications with Apache Cassandra Part 2 (48 minutes)
    Building Applications with Apache Cassandra Part 3 (28 minutes)
    Building Applications with Apache Cassandra Part 4 (34 minutes)
    Hadoop Data Warehousing with Hive Part 1 (49 minutes)
    Hadoop Data Warehousing with Hive Part 2 (40 minutes)
    Hadoop Data Warehousing with Hive Part 3 (1 hour 32 minutes)
    Hadoop Data Warehousing with Hive Part 4 (42 minutes)
    Hands-on Visualization with Tableau Part 1 (57 minutes)
    Hands-on Visualization with Tableau Part 2 (30 minutes)
    Hands-on Visualization with Tableau Part 3 (36 minutes)
    Hands-on Visualization with Tableau Part 4 (30 minutes)
    Designing Data Visualizations Workshop Part 1 (46 minutes)
    Designing Data Visualizations Workshop Part 2 (27 minutes)
    Designing Data Visualizations Workshop Part 3 (59 minutes)
    Designing Data Visualizations Workshop Part 4 (38 minutes)
    Developing applications for Apache Hadoop Part 1 (45 minutes)
    Developing applications for Apache Hadoop Part 2 (49 minutes)
    Developing applications for Apache Hadoop Part 3 (45 minutes)
    Developing applications for Apache Hadoop Part 4 (35 minutes)
    What Marketers Can Learn From Analysts (41 minutes)
    Jumpstart Welcome (11 minutes)
    Big Data and Supply Chain Management: Evolution or Disruptive Force? (37 minutes)
    Ammunition for the CFO: How to be a Hard-Nosed Business Customer for Analytics (25 minutes)
    3 Essential Skills of a Data Driven CEO (25 minutes)
    Business Intelligence: What have we been missing? (33 minutes)
    Do it Right: Proven Techniques for Exploiting Big Data Analytics (18 minutes)
    The Business of Big Data (53 minutes)
    Big Data, Serious Games, and the Future of Work (22 minutes)
    It’s Not Just About the Data……the Power of Driving Impact Through Intent and Interconnectedness (24 minutes)
    Wrap-up Session (32 minutes)
    Strata Conference 2012: Day 2
    The Apache Hadoop Ecosystem (10 minutes)
    Decoding the Great American ZIP myth (10 minutes)
    Guns, Drugs and Oil: Attacking Big Problems with Big Data (9 minutes)
    Machine Learning and Big Data: Sustainable Value or Hype? (6 minutes)
    Learning Analytics: What Could You Do With Five Orders of Magnitude More Data About Learning? (6 minutes)
    A Big Data Imperative: Driving Big Action (15 minutes)
    The Information Architecture of Medicine is Broken (15 minutes)
    Do We Have The Tools We Need To Navigate The New World Of Data? (13 minutes)
    Street Fighting Data Science (36 minutes)
    Data Ingest, Linking, and Data Integration via Automatic Code Generation (38 minutes)
    Disambiguation: Embrace wrong answers and find truth (31 minutes)
    Netflix recommendations: beyond the 5 stars (48 minutes)
    Data Science in Product Development (27 minutes)
    Mo’ Data, Mo’ Problems (32 minutes)
    Business Management Strategies for Big Data (46 minutes)
    Becoming a Data-Driven Organization (36 minutes)
    Building a Data Strategy: Data Enabling Toys at Leapfrog (40 minutes)
    Analytics in a Community-Driven Fashion Retailer (36 minutes)
    Data Science in Marketing Analytics (30 minutes)
    Science of Visualization (42 minutes)
    Effective Data Visualization (39 minutes)
    Building a Data Narrative: Discovering Haight Street (38 minutes)
    Crafting Meaningful Data Experiences (24 minutes)
    Roll Your Own Front End: A Survey of Creative Coding Frameworks (41 minutes)
    Sketching With Data (32 minutes)
    The Future of Hadoop: Becoming an Enterprise Standard (35 minutes)
    Hadoop + JavaScript: what we learned (39 minutes)
    Architecting Virtualized Infrastructure for Big Data (45 minutes)
    Aggregating and serving local places data and ads at Citygrid (46 minutes)
    Exploring Social Data: Use Cases for Real-World Application (35 minutes)
    Understanding Social Contagion (36 minutes)
    Changing Data Standards from Wall Street to DC and Beyond (45 minutes)
    Big Data: Wall Street Style (40 minutes)
    Big Data = Bigger Metadata (37 minutes)
    Linked Data: Turning the Web into a Context Graph (33 minutes)
    Data as a Strategic Weapon – Walmart, Netflix and Apigee Panel Discussion (42 minutes)
    Creating Real Business Value with Big Data Analytics (46 minutes)
    Getting the Most from Your Hadoop Big Data Cluster (39 minutes)
    Amazon DynamoDB: A seamlessly scalable NoSQL service (43 minutes)
    Turning Big Data Into Competitive Advantage (37 minutes)
    Unleash Insights On All Data With Microsoft Big Data (46 minutes)
    SQLFire – An Ultra-fast, Memory-optimized Distributed SQL Database (43 minutes)
    MapReduce for the Rest of Us: Unlocking Data Science for the Business User (34 minutes)
    Automated Understanding – The Next Evolution in Big Data Analytics (41 minutes)
    RHadoop, R meets Hadoop (35 minutes)
    Monitoring Apache Hadoop – a big data problem? (31 minutes)
    How to develop Big Data Pipelines for Hadoop (38 minutes)
    How Crunch Makes Writing, Testing and Running of MapReduce Pipelines Easy, Efficient and Even Fun! (46 minutes)
    Analyzing Hadoop Source Code with Hadoop (37 minutes)
    Strata 2012 Startup Showcase (4 minutes)
    Strata Conference 2012: Day 3
    Democratization of Data Platforms (7 minutes)
    5 Big Questions about Big Data (9 minutes)
    The Trouble with Taste (8 minutes)
    Embrace the Chaos (9 minutes)
    Open Data and the Internet of Things (10 minutes)
    Big Data’s Next Step: Applications (5 minutes)
    Heritage Provider Network, Announces the Winner of the Second Heritage Health Progress Prize (4 minutes)
    Using Google Data for Short-term Economic Forecasting (15 minutes)
    Is this normal? Finding anomalies in real-time data (34 minutes)
    From Predictive Modeling to Optimization: The Next Frontier (35 minutes)
    Mining Unstructured Data: Practical Applications (34 minutes)
    Migratory data: the distributed data you carry with you (42 minutes)
    Humans, Machines, and the Dimensions of Microwork (43 minutes)
    Big Data and Bibliometrics: Crowdsourcing the World’s Largest Database of Research (31 minutes)
    Democratizing BI at Microsoft: 40,000 Users and Counting (41 minutes)
    Mining the Eventbrite Social Graph for Recommending Events (47 minutes)
    Data Jujitsu: The Art of Turning Data into Product (44 minutes)
    Data Marketplaces for your extended enterprise: Why Corporations Need These to Gain Value from Their Data (46 minutes)
    Big Data Meets Big Weather (43 minutes)
    Improving Productivity Using Real-Time Data (37 minutes)
    Video Graphics – Engaging and Informing (41 minutes)
    Rich Sports Data and Augmented Reality (41 minutes)
    Visualizing Geo Data (42 minutes)
    Beautiful Vectors: Emerging Geospatial technologies in the browser (38 minutes)
    From Big Data to Big Insights (43 minutes)
    Exploring the Stories Behind the Data (40 minutes)
    Hadoop Analytics in Financial Services (32 minutes)
    Using Map/Reduce To Speed Analysis of Video Surveillance (34 minutes)
    Beyond Map/Reduce: Getting Creative With Parallel Processing (38 minutes)
    Petabyte Scale, Automated Support for Remote Devices (39 minutes)
    Big Analytics Beyond the Elephants (1 hour 0 minutes)
    If Data Wants to Be Free, is Privacy a Prison? (40 minutes)
    Pretty Simple Data Privacy (42 minutes)
    OODA Loop: How to Understand the Use Cases for Big Data (43 minutes)
    It’s Not “Junk” [Data] Anymore (42 minutes)
    Big Data for the Common Good (42 minutes)
    Personalized Medicine and Individual Cancer Care, it is a data problem (1 hour 0 minutes)
    Solving big data analytics with an emerging data-centric language (35 minutes)
    Big Data and Machine Learning: A Reality Check (40 minutes)
    Big Data Big Costs? (40 minutes)
    Big Data Meets the Big Cloud: How To Monitor Thousands of Servers (39 minutes)
    Big Data and the Social Firehose (40 minutes)
    Big Data Applications in Action (30 minutes)
    Start Innovating! Crowdsourcing and Big Data (41 minutes)
    Apache Cassandra: NoSQL Applications in the Enterprise Today (41 minutes)
    Storm: distributed and fault-tolerant realtime computation (47 minutes)
    Analytics from 330 million smartphones (43 minutes)
    Connecting Millions of Mobile Devices to the Cloud (24 minutes)
    Open Source Ceph Storage: Scaling from Gigabytes to Exabytes with Intelligent Nodes (40 minutes)
    Mapping social media networks (with no coding) using NodeXL (41 minutes)

    For True Prediction You Have to Future Proof the Application

    Future proofing your prediction means the model takes into account factors that affect the future states in which the model will be applied, and which may introduce patterns of variation in the score.  I first heard this term from my great friend David Vogel, in his Kaggle competition write-up.
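One common way to check whether a model holds up in future states is out-of-time validation: fit on an earlier period and score on a later one. The sketch below is only an illustration under my own assumptions, not the method from the write-up mentioned above. The data is made up, and the names `out_of_time_split`, `fit_mean_model`, and `mean_abs_error` are hypothetical helpers introduced here.

```python
from datetime import date
from statistics import mean


def out_of_time_split(rows, cutoff):
    """Train on rows dated before `cutoff`; hold out rows dated on or after it."""
    train = [r for r in rows if r["date"] < cutoff]
    holdout = [r for r in rows if r["date"] >= cutoff]
    return train, holdout


def fit_mean_model(train):
    """Toy model (an assumption for illustration): always predict the training average."""
    avg = mean(r["y"] for r in train)
    return lambda row: avg


def mean_abs_error(model, rows):
    """Average absolute gap between the prediction and the observed target."""
    return mean(abs(model(r) - r["y"]) for r in rows)


# Hypothetical monthly data in which the target drifts upward over time.
rows = [{"date": date(2012, m, 1), "y": 10.0 + m} for m in range(1, 13)]

train, holdout = out_of_time_split(rows, cutoff=date(2012, 10, 1))
model = fit_mean_model(train)

in_time_error = mean_abs_error(model, train)
future_error = mean_abs_error(model, holdout)

print(f"error on the training period: {in_time_error:.2f}")
print(f"error on the later period:    {future_error:.2f}")
```

In this toy case the error on the later period is larger than on the training period, which is the kind of drift in the score that a random train/test split would hide.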

    The key situations that demand future proofing are created by the various dynamics that come into play when predictive models are applied.  See: