Basics of Scientific Research Method…
Extensions of these are what gave birth to the FDA and to other institutional methods, such as the IRB.
Modern statisticians help, too. Without that, “experts” are not willing to believe a result, even if one injects the filth into oneself trying to prove or disprove it. Examples from nutrition and health.
This note was driven by interest from a student. As I started collecting the material, it became big enough that I thought it better to put it on the blog than to write a long email. I realize there are other people who may benefit from this note.
Here it is. This selection is about what is out there, so the items are not ordered by importance, nor do they come with insightful evaluations. The details are sourced from the web sites and are within “quotes” to indicate that they are not my words.
It looks like a great opportunity is developing in venture capital analytics.
- http://correlationvc.com/approach claims to have the largest database of US venture capital financings. They use it to model on the basis of past successes, “tracking everything from key financing terms, investors, boards of directors, management backgrounds, industry sector dynamics and outcomes”. They build models and score the database, based on proprietary information from the client and a collection of variables from the database, to see how new investments can be made.
- http://www.analytics-ventures.com/ “mission is to invest at the earliest stages of a company to leverage our domain expertise in big data and analytics in order to help companies get quickly on the path toward sales, growth and profitability”
- http://www.venture-science.com/venture-capital-analytics argues that “Multi Criteria Decision Analysis (MCDA) methods provide much more powerful guidance than the traditional gut-feeling approach.
Numerical approach to capital deployment decisions
The following questions are examples of important capital deployment decisions:
- Company X is raising $10M. How much should we invest? $2M, $4M, $7.5M?
- We have an opportunity to participate in a new round of a current portfolio company. How much should we put in? 3x original? 5x? Pro-rata?
- How much dry powder do we leave? 50% each year, 25%? All in by year 3?
Funds must try to calculate optimum levels of investment, reinvestment and dry powder based on their strategy and the return expectations of the market. Methodically solving for these unknowns establish a stronger model for capital deployment than heuristics.
In a portfolio with n companies, n+1 changes the risk/return characteristics. Looking at the incoming deal flow and re-up opportunities we evaluate possibilities of permutations of portfolios. Since we can quantify the risk of startups in multiple dimensions, we can minimize overall risk of the portfolio at n+1 and farther down the line.”
- http://info.crunchbase.com/about/crunchbase-venture-program/ provides an “open, up-to-date, and accurate database of companies, investors, and entrepreneurs. Through this initiative, venture funds, angel groups, accelerators, and incubators can guarantee that their public data is accurately represented inside CrunchBase”. This is a free database.
- http://gigaom.com/2013/11/02/venture-capital-in-an-age-of-algorithms/ discusses the new science of algorithmic approaches to venture capital investment.
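The venture-science quote above notes that moving a portfolio from n to n+1 companies changes its risk profile. A minimal sketch of that idea, assuming equal weights and a known covariance matrix of returns, is below. The covariance numbers and the function names (`portfolio_variance`, `best_addition`) are hypothetical and made up for illustration; this is not the method any of the firms listed above actually uses.

```python
import numpy as np

def portfolio_variance(cov, weights):
    """Variance of a portfolio: w' * cov * w."""
    w = np.asarray(weights, dtype=float)
    return float(w @ cov @ w)

def best_addition(cov, current, candidates):
    """Pick the candidate whose addition gives the lowest equal-weight variance."""
    scores = {}
    for c in candidates:
        members = list(current) + [c]
        sub = cov[np.ix_(members, members)]           # covariance of the n+1 portfolio
        w = np.full(len(members), 1.0 / len(members))  # equal weights (an assumption)
        scores[c] = portfolio_variance(sub, w)
    return min(scores, key=scores.get), scores

# Hypothetical covariance matrix for four companies (illustrative numbers only).
cov = np.array([
    [0.09, 0.03, 0.06, 0.00],
    [0.03, 0.16, 0.08, 0.01],
    [0.06, 0.08, 0.25, 0.02],
    [0.00, 0.01, 0.02, 0.36],
])

# Companies 0 and 1 are already held; companies 2 and 3 are incoming deals.
choice, scores = best_addition(cov, current=[0, 1], candidates=[2, 3])
print(choice, scores)
```

In this made-up example the riskier company on its own (index 3, variance 0.36) is the better addition, because it is nearly uncorrelated with the existing holdings.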
The multivariate regression fitted by PROC REG (SAS) or lm (R language) will yield the same regression coefficients as a separate multiple regression for each dependent variable (dependent var1, dependent var2, …), provided you use all the same predictor variables in each model, even if the dependent variables are highly correlated. The purpose of this commonly used structure is the more “powerful” (in the sense of statistical power) testing you can do with the parameters of all the regression models at the same time.
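The claim that the joint fit and the equation-by-equation fits give identical coefficients can be checked numerically. The sketch below uses NumPy least squares on simulated data rather than PROC REG or lm; the data, seed, and variable names are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Design matrix: intercept plus two predictors, shared by both equations.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

# Two dependent variables that are correlated through a shared noise term.
shared = rng.normal(size=n)
y1 = X @ np.array([1.0, 2.0, -1.0]) + shared + rng.normal(scale=0.5, size=n)
y2 = X @ np.array([0.5, -1.0, 3.0]) + shared + rng.normal(scale=0.5, size=n)
Y = np.column_stack([y1, y2])

# Joint (multivariate) least squares: one coefficient column per dependent variable.
B_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Separate least squares fits, one per dependent variable.
b1, *_ = np.linalg.lstsq(X, y1, rcond=None)
b2, *_ = np.linalg.lstsq(X, y2, rcond=None)
B_separate = np.column_stack([b1, b2])

same = np.allclose(B_joint, B_separate)
print(same)
```

The correlation between y1 and y2 does not enter the coefficient estimates here; it only matters for the joint tests across equations.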
The alternative, which does utilize the correlation among the dependent variables, is the Seemingly Unrelated Regression (SUR) model. It treats the models for (dependent var1, dependent var2, …) as a system of equations and takes the correlations between equations into consideration.
The PROC to use in SAS is PROC SYSLIN. The link http://www.ats.ucla.edu/stat/sas/webbooks/reg/chapter4/sasreg4.htm, in section 4.5.1, provides examples specifically for “SUR” with PROC SYSLIN. SUR models can also handle a different set of variables in each equation.
“…Proc syslin with sur option and proc reg both allow you to test multi-equation models while taking into account the fact that the equations are not independent. The proc syslin with sur option allows you to get estimates for each equation which adjust for the non-independence of the equations, and it allows you to estimate equations which don’t necessarily have the same predictors. By contrast, proc reg is restricted to equations that have the same set of predictors, and the estimates it provides for the individual equations are the same as the OLS estimates. However, proc reg allows you to perform more traditional multivariate tests of predictors.”
For R, the link http://www.ats.ucla.edu/stat/r/faq/sureg.htm provides worked examples.
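To make the SUR idea concrete, here is a minimal NumPy sketch of the textbook two-step feasible GLS estimator. It is an illustration under simulated data, not a reproduction of what PROC SYSLIN or the R example does internally; the function name `sur_fgls`, the data, and the seed are assumptions.

```python
import numpy as np

def sur_fgls(X_list, y_list):
    """Two-step feasible GLS for a system of equations (textbook SUR sketch)."""
    n = len(y_list[0])
    # Step 1: OLS per equation, then the cross-equation residual covariance.
    ols = [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(X_list, y_list)]
    resid = np.column_stack([y - X @ b for X, y, b in zip(X_list, y_list, ols)])
    sigma = resid.T @ resid / n
    # Step 2: GLS on the stacked system, error covariance = kron(sigma, I_n).
    ks = [X.shape[1] for X in X_list]
    X_big = np.zeros((len(X_list) * n, sum(ks)))
    col = 0
    for i, X in enumerate(X_list):
        X_big[i * n:(i + 1) * n, col:col + ks[i]] = X  # block-diagonal design
        col += ks[i]
    y_big = np.concatenate(y_list)
    w = np.kron(np.linalg.inv(sigma), np.eye(n))
    beta = np.linalg.solve(X_big.T @ w @ X_big, X_big.T @ w @ y_big)
    return np.split(beta, np.cumsum(ks)[:-1]), ols

rng = np.random.default_rng(1)
n = 300
x1, x2, x3 = rng.normal(size=(3, n))
# Errors of the two equations are correlated (0.8), which is what SUR exploits.
e = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=n)

Xa = np.column_stack([np.ones(n), x1, x2])  # equation 1 predictors
Xb = np.column_stack([np.ones(n), x3])      # equation 2 uses a different predictor
ya = Xa @ np.array([1.0, 2.0, -1.0]) + e[:, 0]
yb = Xb @ np.array([0.5, 3.0]) + e[:, 1]

sur, ols = sur_fgls([Xa, Xb], [ya, yb])                # different predictors
same_X_sur, same_X_ols = sur_fgls([Xa, Xa], [ya, yb])  # identical predictors
print(sur, ols)
```

With different predictors per equation the SUR and OLS estimates generally differ; with identical predictors in every equation the SUR estimates collapse to the OLS estimates, which matches the quoted remark about proc reg.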
The reference I provided for correlated ordered logistic regression is still valid, but you have to use the interaction of the dependent variables on the right side of the equations.
Here too, the very popular method that I recommend exploring is the boosted random forest, which is usable for both classification and regression, especially on the Kaggle project, following the spirit of the reference for the correlated ordered regression model mentioned above.
There is also an advanced package available in R: the function vglm in VGAM (Vector Generalized Linear and Additive Models). VGAM is extensively documented at http://cran.r-project.org/web/packages/VGAM/VGAM.pdf.
This is a fun article, and I truly think these 99-plus reasons should factor into our machine learning approach if the algorithm is to predict with higher accuracy, much higher accuracy.
Yes, I use them in my own humble way to predict stock prices.
7APR14 – Today is my first example. I will continue to edit and expand this. I am not sure when I will be done with all 99 reasons.
Today, TWTR tumbled for two main reasons:
– It has not shown how to dramatically increase revenue, and the early employees and investors were allowed to sell for the first time, which is called the “Expiry of Lock Up Period”. The following is a screen capture from finance.yahoo.com (I will always use this source for all the examples).
Do you want to learn R, Python, Java, … or any of hundreds of different programming languages?
- Students can use lynda.com to stay current in their courses or for classes that require additional software skills.
The full list of courses is available at the link http://www.lynda.com/subject/all
The Northwestern University Learning & Organization Development (L&OD) team collaborates with faculty and staff who want to develop their talent and advance their workplace outcomes, processes and engagement. L&OD provides consulting, coaching, workshops, retreats and tools for individuals, groups and organizations.
Access to lynda.com and Registration for Summer Workshops Available
Have you heard? Northwestern recently partnered with lynda.com to provide all staff, faculty and students unlimited, on-demand access to a full library of online courses at no additional cost for a pilot period of one year.
To date, over 2,300 University members have already accessed nearly 1,000 hours of online learning. And, lynda.com isn’t just for learning computer applications – consider referencing some of the sample playlists to help you get started on topics like career development, leadership, computer applications and workplace skills.
In addition, registration is now open for workshops offered in June, July and August. Check your mailbox for our summer catalog or look online to see what’s coming up.