Monthly Archives: August 2013

Dimensionality Reduction Methods – Principal Component Analysis, Factor Analysis, Cluster Analysis, and Correspondence Analysis

This document will keep growing as it consolidates concepts, examples, statistical estimation, computational methods, and references that suit the title.

Factor analysis:

The algebra for statistical structures. This is part of a 20-chapter collection of notes used by Prof. Shalizi at CMU for his data mining course.

A presentation-style write-up and deck is here: http://www.statpower.net/Content/313/Lecture%20Notes/RegressionAndFA.pdf

On principal components vs. FA
http://www2.sas.com/proceedings/sugi30/203-30.pdf 
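To make the principal components side of that comparison concrete, here is a minimal sketch in plain Python. It is an illustration only, not taken from the linked paper: the function name `pca_2d` and the data points are made up, and it handles just two variables so that the eigenvalues of the 2x2 covariance matrix can be written in closed form. Factor analysis differs in that it models only the variance shared among variables and leaves a unique variance for each variable, so it would need an iterative fit that is not shown here.

```python
import math

def pca_2d(points):
    """Return ((largest, smallest) eigenvalue, unit direction of the first component)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    trace = sxx + syy
    spread = math.hypot(sxx - syy, 2 * sxy)
    eigenvalues = ((trace + spread) / 2, (trace - spread) / 2)
    # Angle of the axis with the largest variance.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return eigenvalues, (math.cos(theta), math.sin(theta))

# Hypothetical data lying exactly on the line y = 2x.
points = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
(lam1, lam2), direction = pca_2d(points)
share = lam1 / (lam1 + lam2)  # share of total variance on the first component
print(direction, share)
```

Because the made-up points fall on a single line, the first component carries essentially all of the variance and points along that line. With real data the share would be lower, and the second component would pick up the remainder.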

On exploratory vs. confirmatory FA.
http://www2.sas.com/proceedings/sugi31/200-31.pdf

A lecture where the mathematics of factor analysis is discussed
http://www.hawaii.edu/powerkills/UFA.HTM

For correspondence analysis, I recommend my own write-up, which discusses how to interpret the SAS results, one aspect of its application.
http://predictive-models.blogspot.com/2009/05/proc-corresp-how-to-assign-best-in.html 

At the other extreme, I am also excited to share the following.

http://predictive-models.blogspot.com/2012/09/dimensionality-reduction-hidden-gem.html

From Data Monster & Insight Monster

New Platforms for Survey Research – Smartphones, SMS, Web

In this survey application, Deloitte examines the value of mobile in the retail industry.

http://www.deloitte.com/assets/Dcom-UnitedStates/Local%20Assets/Documents/RetailDistribution/us_retail_Mobile-Influence-Factor_062712.pdf

Mobile influence is projected to grow roughly fourfold by 2016 compared with what is being experienced in 2013, from close to 5% to close to 20%. For much more detail, see the original report linked above.

The smart phone survey method that was used to collect the above data is

Here is a slideshare presentation on Online Surveys vs. Mobile Surveys.

Here is a landscape summary of “Mobile Survey Platform” from About.com

Here is the HOWTO.GOV landscape summary of online surveys.

Coverage bias is common in mobile web surveys. Here is a peer-reviewed article.
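As a rough illustration of what coverage bias means for a survey estimate, the sketch below uses the standard identity that the bias of the covered-group mean equals the uncovered share of the population times the difference between the covered and uncovered means. The function name `coverage_bias` and all the numbers are hypothetical and are not taken from the article mentioned above.

```python
def coverage_bias(n_covered, mean_covered, n_not_covered, mean_not_covered):
    """Bias of the covered-group mean relative to the full-population mean."""
    total = n_covered + n_not_covered
    return (n_not_covered / total) * (mean_covered - mean_not_covered)

# Hypothetical population: 700 people reachable by mobile web, 300 not reachable.
# The means are the share answering "yes" to some question in each group.
bias = coverage_bias(700, 0.60, 300, 0.40)

# Cross-check against the direct definition: covered mean minus population mean.
population_mean = (700 * 0.60 + 300 * 0.40) / 1000
direct = 0.60 - population_mean
print(bias, direct)
```

The bias shrinks when either the uncovered share is small or the two groups answer alike, which is why coverage matters most when mobile users differ from non-users on the question being asked.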


Bigdata Download – Yahoo Collection – Big Data = Big Money: These Two Companies Are Set to Profit, Says Porter Bibb

http://finance.yahoo.com/blogs/daily-ticker/big-data-big-money-two-companies-set-profit-131250649.html

Who is harnessing and who is capitalizing – RocketFuel – Internet Advertising Network

Case studies referred:

The Sexiest Job of the 21st Century – Yahoo’s Bigdata Download take

‘Big Data’ Generates Big Returns: Q&A With VC Roger Ehrenberg

How big data saves money on legal fees

Firm Uses Cell Tower Data to Send Ads

Big Data Download is where such stories are discussed on Yahoo Finance.


Some Top R Reference Materials Recommended

There are many collections like this on the internet. My purpose here is to create a list for friends and colleagues who ask for such recommendations.

A simple, logically organized document is perhaps this one: http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf

Some of R's logical traps are discussed in an interesting way here: http://www.burns-stat.com/pages/Tutor/R_inferno.pdf

This one is truly quicker and has code snippets for specific tasks.

1. http://heather.cs.ucdavis.edu/~matloff/R/RProg.pdf – This 104-page document was preparatory work by Norman Matloff (University of California, Davis) for his book, “The Art of R Programming”. It has a book-like appeal.

2. http://r-statistics.net/collections-r.html (a kind of everyday learning modules). Similar to that is http://www.cyclismo.org/tutorial/R/

3. A lecture-style presentation deck is here: http://www.sph.umich.edu/csg/abecasis/class/815.04.pdf

4. If you want a presentation about just R graphics, here is a 75-slide deck: http://faculty.washington.edu/cadolph/vis/graph_r.p.pdf

5. I like the following 103-page ebook, which is published on CRAN, the archive network that distributes R software: http://www.cran.r-project.org/doc/manuals/R-intro.pdf

6. One more R-guide ebook – 81 pages – http://stat.ethz.ch/CRAN/doc/contrib/Owen-TheRGuide.pdf

Here are some YouTube videos that I highly recommend because they are very engaging.

Here is the claim from the Stanford University collection of lectures. By the end of parts I, II, and III, participants will be able to:
· Interact with R using commands passed through the console
· Import and export data in various formats and transform those data in R
· Make statistical graphics plots (and more)
· Write small scripts and functions using the R language.

For a complete description of the classes & Installing “R” and other packages prior to the class: please see instructions for “Introduction to R programming I & II course” [pdf] http://elane/laneconnex/public/media/…

1. PART1 here. http://www.youtube.com/watch?v=HKjSKtVV6GU – the first 20 minutes cover installation

2. PART2 here. http://www.youtube.com/watch?v=BsWY7uwbs70

3. PART3: Using R graphics – http://www.youtube.com/watch?v=mMaGsVXFfv8

4. To complete the above, here is how to create your own package. http://www.youtube.com/watch?v=8-dGf-7arFI

If you use this list, tell me how it can be refined for other readers.

Updated with time series applications in R:

http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf
http://www.statoek.wiso.uni-goettingen.de/mitarbeiter/ogi/pub/r_workshop.pdf 
