Creating University Efficiency Through Data Driven Budgeting

Each semester, universities manage millions of dollars in scholarship funds, grants, loans, and tuition revenue. The university must budget for each allocation and develop management plans accordingly, affecting enrollment, student success, university revenue and even university relations.

Goldstrike Data has developed state-of-the-science financial aid budgeting algorithms:

  • Leveraging individual student data to simulate the most likely budgeting outcomes
  • Providing 4-, 8-, 12-, and 16-month lead times

Goldstrike has reduced Michigan Tech’s financial aid relative budgeting error from approximately 10% to less than 3% on average!

Michigan Tech can now allocate funds and resources much more efficiently each semester!

…Goldstrike [is] delivering services that are, in my mind, beyond promising. This will soon be the new normal in higher education and Goldstrike has gotten us well ahead of the curve.
— Associate Vice President for Enrollment and University Relations

Michigan Tech Collaborates with Locally Founded Goldstrike Data for Financial Aid Budgeting

Goldstrike Data L.L.C., a big data company founded by a Michigan Technological University alumna, has harnessed the power of predictive analytics to create financial aid budgeting efficiencies while increasing student success at Michigan Technological University.

Goldstrike Data was founded by Ashley Kern while she was a data science master’s student at Michigan Tech. The company is supported by experts in data science and statistics alongside a team of Oracle database specialists, business analysts and subject matter experts. Goldstrike focuses on leveraging advanced machine learning algorithms in conjunction with statistical knowledge to solve big data challenges.

Recently, Goldstrike has developed predictive models to forecast financial aid budgets for returning undergraduate students.

The most recent financial aid budgeting models have just been validated with results from the Spring 2017 semester enrollment numbers. In May 2016, Goldstrike forecast the Spring 2017 financial aid budget within 1.5 percent of what was paid out. Their data-driven approach leveraged advanced predictive modeling techniques, reducing the university’s historical budget projection variance, which had previously hovered around 10 percent.
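
The accuracy figures quoted here are relative errors: the gap between the forecast and the actual payout, divided by the actual payout. A minimal sketch of that arithmetic follows. The dollar amounts are made-up illustrative values, not Michigan Tech figures, and the function name is hypothetical.

```python
def relative_budget_error(forecast: float, actual: float) -> float:
    """Return the absolute forecast error as a fraction of the actual payout."""
    if actual == 0:
        raise ValueError("actual payout must be non-zero")
    return abs(forecast - actual) / abs(actual)


# Hypothetical numbers for illustration only.
actual_paid = 20_000_000.0
old_style_forecast = 22_000_000.0   # misses the payout by 10%
model_forecast = 20_300_000.0       # misses the payout by 1.5%

old_error = relative_budget_error(old_style_forecast, actual_paid)
new_error = relative_budget_error(model_forecast, actual_paid)
print(f"historical-style error: {old_error:.1%}")
print(f"model-style error: {new_error:.1%}")
```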

Rather than calculating the percentage of students that historically re-enroll and then estimating the remaining financial aid to be paid for an annual budget, the Goldstrike algorithms break the student data down to a much more granular level: the individual student. These state-of-the-science forecasting methods result in a much higher degree of accuracy than is available through other commercially available solutions.

In addition to providing an annual financial aid forecast up to 12 months in advance, Goldstrike Data simulates various scenarios for each budget. This provides university decision-makers additional security in their budgeting and allocation decisions.
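
One way to picture student-level scenario simulation is a Monte Carlo draw over individual re-enrollment probabilities. The sketch below is an assumption about the general technique, not Goldstrike's actual algorithm. The student records, probabilities, and award amounts are invented for illustration.

```python
import random
import statistics

# Hypothetical returning students: (probability of re-enrolling, aid award in dollars).
students = [(0.95, 12_000), (0.80, 8_500), (0.60, 15_000), (0.90, 4_000), (0.70, 10_000)]


def simulate_budget(students, n_scenarios=10_000, seed=42):
    """Draw re-enrollment outcomes per student and total the aid paid in each scenario."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_scenarios):
        total = sum(award for p, award in students if rng.random() < p)
        totals.append(total)
    return totals


totals = simulate_budget(students)
expected = sum(p * award for p, award in students)  # analytic expected payout
cuts = statistics.quantiles(totals, n=20)           # cut points at 5%, 10%, ..., 95%
low, high = cuts[0], cuts[-1]

print(f"expected payout: ${expected:,.0f}")
print(f"simulated mean:  ${statistics.mean(totals):,.0f}")
print(f"5th to 95th percentile range: ${low:,.0f} to ${high:,.0f}")
```

The percentile range is what gives a decision-maker a sense of how far the payout could plausibly drift from the single expected figure.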

Associate Vice President for Enrollment and University Relations, John Lehman, notes, “There are obviously many aspects of this story that appeal to me. The fact that Ms. Kern uses a Michigan Tech education to help provide greater efficiency towards our operation being just one. More importantly, she and her team at Goldstrike are delivering services that are, in my mind, beyond promising. This will soon be the new normal in higher education and Goldstrike has gotten us well ahead of the curve.”

Goldstrike Data is also working with student success directors at Michigan Tech to identify individual at-risk students up to 8 months before students struggle. Advanced machine learning algorithms matching these at-risk students with individualized interventions will be used to increase overall retention and graduation rates.

Financial Sustainability and University Fundraising

Nationwide, state funding has decreased on average by 28% or $2,353 per student as of 2013 (Oliff, Palacios, Johnson, & Leachman, 2013). Meanwhile, the rising cost of tuition has far outpaced inflation. The average student from the class of 2016 owes $37,172 in student loans, and our nation's 44 million student loan borrowers have accumulated $1.3 trillion in debt.

Higher education institutions have been under scrutiny in the public eye and are challenged to find and operationalize efficient business practices. According to Cathy Sandeen of the University of Wisconsin Colleges and Extension, this means helping staff and faculty focus on high-value tasks to make a difference for students, creating cost savings, creating financial sustainability, expanding enrollment, and enhancing capital and fundraising campaigns.

Goldstrike Data provides small to medium-sized universities with data-driven insights that enhance their fundraising campaigns and yield higher levels of financial contributions. By creating financial sustainability, these universities are able to invest capital into further research and development, help students progress towards their degrees, and expand enrollment, which in turn increases revenue.

Many smaller universities have a high proportion of Pell Grant recipients in their student body. These Pell Grant graduates were found to be more likely to take out student loans, with 88% of them averaging $31,200 in debt upon graduation. To sustain enrollment growth and continued student success, these universities depend on the success of their fundraising campaigns.

Goldstrike Data leverages the data already available at your institution to provide data driven support to answer the following questions during your fundraising campaign:

  1. Which potential donors should we be targeting during this fundraising round?
  2. What tactics or marketing schemes increase the probability that they will give?
  3. What is the most cost effective way to target these high propensity donors?
  4. How and when should we move smaller donors into a major gift area?
  5. What is the statistically likely size of their donation? 
  6. How much overall funding can we expect in a given round of campaigning?

By providing statistical solutions to these questions, we minimize the risk in decision making by selectively targeting high-propensity donors and providing a range of possible outcomes for more effective decisions. By automating our processes, customized to you, we optimize operational efficiency through time and cost savings so that the institution's advancement and development directors may focus on their donor relationships.
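
Several of the questions above (whom to target, the likely gift size, and the expected campaign total) can be framed as expected-value calculations. The sketch below is a simplified illustration under stated assumptions, not Goldstrike's method. The `Donor` fields, the probabilities, and the dollar amounts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Donor:
    name: str
    give_probability: float   # assumed output of a propensity model
    likely_gift: float        # assumed estimate of gift size in dollars
    contact_cost: float       # assumed cost to reach this donor


donors = [
    Donor("A", 0.60, 500.0, 20.0),
    Donor("B", 0.10, 10_000.0, 150.0),
    Donor("C", 0.05, 200.0, 20.0),
]


def expected_net(d: Donor) -> float:
    """Expected gift minus the cost of contacting the donor."""
    return d.give_probability * d.likely_gift - d.contact_cost


# Rank donors by expected net return and keep only those worth contacting.
ranked = sorted(donors, key=expected_net, reverse=True)
targets = [d.name for d in ranked if expected_net(d) > 0]

# Expected gifts if every donor in the list were contacted.
expected_total = sum(d.give_probability * d.likely_gift for d in donors)

print("contact order:", targets)
print(f"expected campaign total: ${expected_total:,.0f}")
```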

Analytics Explained: the Full Package

Descriptive, Subscriptive, Predictive, Prescriptive 

Big data and analytics is very similar to writing a story or a report where the author addresses all of the big W's (and H) of any event.  The more information that can be derived from the story, the more value is added. The following article describes the four phases of analytics. 

  • What happened?
  • Why did it happen?
  • What will happen?
  • How do we control what will happen?




According to Dr. Michael Wu, the chief scientist of Lithium Technologies, more than 80% of analytics are descriptive. This includes most dashboard and personalized analytics software solutions that condense data into more useful nuggets of information. When working with Big Data, it may not even be possible to describe all of the data collected, so only a sample of the collected data may be described.

Descriptive analytics may be termed exploratory data analysis by a data scientist or statistician. This is the first and most important step in the problem solving process but it should not be thought of as the final product. It does not solve the problem but rather sets the framework for asking the correct questions. 

Descriptive analytics may include calculating averages, variances, percentiles, aggregated tables and a wide variety of creative visualizations. By the end of the descriptive phase of the project, we should be able to describe what occurred. 
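
As a small illustration of the descriptive measures just listed, the sketch below computes a mean, a sample variance, and quartiles with Python's standard library. The GPA values are hypothetical.

```python
import statistics

# Hypothetical first-semester GPAs for a small cohort.
gpas = [2.1, 2.8, 3.0, 3.2, 3.4, 3.5, 3.7, 3.9]

mean_gpa = statistics.mean(gpas)
variance = statistics.variance(gpas)          # sample variance (n - 1 denominator)
quartiles = statistics.quantiles(gpas, n=4)   # 25th, 50th, 75th percentile cut points

print(f"mean: {mean_gpa:.2f}")
print(f"sample variance: {variance:.3f}")
print("quartiles:", [round(q, 2) for q in quartiles])
```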


Subscriptive or diagnostic analytics builds on the questions that were formed during the exploratory data analysis step. Using various statistical methods, we can begin formulating and testing hypotheses about why things occurred.

For example, an analysis of variance (ANOVA) may be performed to determine if certain factors have an effect on the outcome of interest. The effect the factor has on the outcome can be estimated and even simulated based on the changing state of the factor. The insights derived during the subscriptive analytics phase can indicate what information will be useful during the following predictive analytics phase.
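
To make the ANOVA idea concrete, the sketch below computes a one-way ANOVA F statistic by hand: the variance between group means divided by the variance within groups. The factor (housing type) and the GPA values are hypothetical.

```python
import statistics

# Hypothetical outcome (GPA) grouped by one factor (housing type).
groups = {
    "on_campus": [3.1, 3.4, 3.6, 3.3],
    "off_campus": [2.7, 3.0, 2.9, 2.8],
    "commuter": [2.5, 2.9, 2.6, 2.8],
}


def one_way_anova_f(groups):
    """Return the F statistic: between-group variance over within-group variance."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = statistics.mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total observations

    ss_between = sum(
        len(values) * (statistics.mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    ss_within = sum(
        (v - statistics.mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


f_stat = one_way_anova_f(groups)
print(f"F statistic: {f_stat:.2f}")
```

A large F value suggests the factor matters. Turning it into a p-value requires the F distribution, which a library routine such as `scipy.stats.f_oneway` provides.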


Using information and hypotheses developed during the descriptive and subscriptive analytics phases, predictive models may be developed to begin answering questions regarding what will happen in similar scenarios in the future. 

This can be as simple as fitting a trend line as seen in the figure to the right or as complex as developing machine learning algorithms to forecast future data. The immense value behind predictive analytics is the ability to foresee potential challenges or successes in your business's operations and to act preemptively. This leads us to our next question regarding the ability to guide future outcomes.
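
The simplest case mentioned above, a fitted trend line, can be sketched with ordinary least squares in a few lines. The yearly enrollment counts are hypothetical, and `fit_line` is an illustrative helper rather than part of any library.

```python
# Hypothetical enrollment counts by year.
years = [2012, 2013, 2014, 2015, 2016]
enrollment = [6900, 7000, 7050, 7200, 7250]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(years, enrollment)
forecast_2017 = slope * 2017 + intercept   # extend the trend one year ahead
print(f"trend: {slope:.1f} students per year")
print(f"2017 forecast: {forecast_2017:.0f}")
```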


Prescriptive analytics is used to recommend various courses of action based on simulated possible outcomes to these recommendations. According to Dr. Wu, prescriptive analytics predicts multiple futures based on decisions made now.

Since a prescriptive model is able to predict the possible consequences based on different choice of action, it can also recommend the best course of action for any pre-specified outcome
— Dr. Wu

This information can be used to optimize business processes. In any business process, you can control certain factors and then forecast the resulting effect on desired outcomes.
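
A prescriptive step can be sketched as simulating each available action through a predictive model and recommending the one with the best predicted outcome. Everything below is a toy assumption: the `predicted_retention` model, the tutoring-hours actions, the costs, and the budget are invented for illustration.

```python
def predicted_retention(tutoring_hours: float) -> float:
    """Toy predictive model (an assumption): retention rises with diminishing returns."""
    return 0.70 + 0.20 * (1 - 0.5 ** (tutoring_hours / 2))


# Candidate actions: weekly tutoring hours offered, mapped to cost per student.
actions = {0: 0.0, 2: 300.0, 4: 600.0, 8: 1200.0}
budget_per_student = 700.0

# Simulate each affordable action and recommend the best predicted outcome.
affordable = {h: c for h, c in actions.items() if c <= budget_per_student}
outcomes = {h: predicted_retention(h) for h in affordable}
best_hours = max(outcomes, key=outcomes.get)

for hours, rate in outcomes.items():
    print(f"{hours} hours -> predicted retention {rate:.1%}")
print("recommended action:", best_hours, "hours")
```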

The ability to gain foresight and control over business operations can grant your organization powerful advantages. These methods have applications in the sciences, engineering, social media, and all areas of business, so when looking for an analytics solution, make sure you are receiving the full analytics package.

The Importance of Analytics in Student Interventions

At Goldstrike Data, our primary goal when working with universities and community colleges is to utilize predictive analytics to support individual student intervention planning and development. This is used to improve freshmen enrollment, retention and finally graduation rates.

Many universities use midterm preliminary grades or evaluations to identify at-risk or struggling first-year students, but by that time an intervention may be too late. With recent advances in predictive analytics, it is possible to identify students who are likely to drop out of school within the first or second semester of their attendance, before they set foot on campus or even apply to the university. This information may be used to support universities in both recruitment and first-year programming development.

Initial research performed at Goldstrike Data has also indicated that student data collected each semester is sufficient to identify at-risk continuing undergraduates as well as to forecast retention rates for continuing students.

Predictive models built on data describing demographics, departmental associations, residency, scholarships and grants, athletic and club participation, admissions information, student GPAs, and credits earned have produced accurate year-ahead identification of at-risk students. These results may be used for targeted student interventions with much earlier identification than was previously available.

After identifying these at-risk students, we can provide further insight into each student's key retention factors by exploring the following areas.

  • Prioritization of student interventions
  • Identification of students most likely to be positively affected by intervention
  • Identification of key retention factors and quantifying the positive or negative effects 
  • Simulating the effect of varying retention factors
  • Developing student summaries and profiles for advising focus 
  • Describing why some students who are at-risk actually re-enroll
  • Prescriptive analytics to increase retention and graduation rates

Goldstrike Data works directly with university administrators to test their hypotheses regarding retention factors and how to create positive change and increase student success. 

Data Driven Success for Higher Education


Goldstrike Data is a predictive analytics solution provider for higher education institutions that helps schools improve their financials, operating performance and student success.

Goldstrike Data is the only Ed-Tech company providing predictive analytics for small to medium-sized higher education institutions and community colleges. We provide university decision support by leveraging state-of-the-science statistical and advanced machine learning algorithms.


Why We Are Different

Of the few competing big data solutions available to universities, nearly all focus on forecasting retention and enrollment for groups of students such as freshman minorities. Goldstrike Data takes this a step further by identifying the probability of enrollment, retention and success for each individual student. We also identify the key retention factors affecting each student's potential for success so that a personalized intervention strategy can be developed for each student. 

By combining our retention and student intervention services, Goldstrike identifies at-risk students at your university or college and then provides data driven insights to support the most effective intervention strategies for particular students. Through our advanced machine learning methods, we can identify at-risk students a year in advance. 

Our Successes

Goldstrike Data recently engaged a mid-sized research university in Michigan and, through our proprietary predictive modeling techniques, developed at-risk student identification models. In validation, 70% of the students identified as at-risk had dropped out of school within the past year. Moving forward, these models will be applied to identify these students ahead of time to increase retention and graduation rates.
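
In classification terms, the 70% figure is the model's precision: the share of flagged students who did in fact drop out. The sketch below shows how precision differs from recall. The counts are hypothetical, and the number of missed dropouts in particular is an assumption that the article does not report.

```python
# Hypothetical validation counts; only the 70% figure comes from the text above.
flagged_and_dropped = 70      # true positives
flagged_but_stayed = 30       # false positives
missed_dropouts = 40          # false negatives (assumed, not reported in the article)

precision = flagged_and_dropped / (flagged_and_dropped + flagged_but_stayed)
recall = flagged_and_dropped / (flagged_and_dropped + missed_dropouts)

print(f"precision (flagged students who dropped out): {precision:.0%}")
print(f"recall (dropouts that were flagged): {recall:.0%}")
```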

Smart Data for Small Businesses

Is big data only useful to big corporations or should small businesses also be finding their competitive edge through data? Below are various tools and solutions for small businesses to use their data and the data surrounding them without completely changing the way their company operates and without burning a huge hole in their pockets.

Due to falling technology costs and new open source tools and data, smaller companies can mine new data insights to make informed decisions for their business. Most small business owners do not believe they can use big data to propel their company simply because they do not think they have access to big data. Even though your small business may not be recording gigabytes of data, there is plenty of public data that you can take advantage of.

Sources recommend cross-referencing your company’s internal data with rapidly expanding external data. Examples of this external data are social networks, government databases, mobile device usage patterns, call-center interactions, and newly available sensor data through the Internet of Things (IoT).

The overall goal is to understand and anticipate your client’s needs.

If your company has an internet presence, you can take a few steps to optimize your marketing strategy. For example, you can track how visitors move from page to page and find what engages various visitors. By understanding what time of day or season customers engage, you can change when you deploy new marketing schemes. By simply describing your most frequent customers, you can target similar groups through online marketing.

If your small business is using external marketing or booking services online, it is also important to determine which of those services are increasing your return on investment and which are not generating business. This small step can save your small business thousands of dollars long term.

All of these examples seem like simple questions to answer but because the data is dispersed across different systems, the data needs to be integrated and presented to business managers in a form from which they can make final decisions.

Goldstrike Data recommends finding or hiring someone to develop personalized and flexible solutions. One-size-fits-all solutions from larger service companies may be overly expensive, and the startup cost will be much larger because they will have to develop a full data infrastructure. By building on the data and operational systems your business already has in place, the startup cost will be smaller and the transition to new tools will be less difficult.

Success Through Mentorship

Goldstrike Data is located in the Upper Peninsula of Michigan in a town where the population is still just shy of 8,000 people. In such a rural area with limited resources, I have found that one of the most important factors affecting a young startup such as Goldstrike is our network of mentors. My mentors have ranged from professors and programs at Michigan Technological University and SmartZone, which is a business incubator, to my family and successful individuals from across the country.

During the first week of May, I was fortunate enough to be sponsored to attend TiECON, a conference in Santa Clara, California, for entrepreneurs at all stages and from around the world. TiE’s mission is to mentor, network, educate and encourage entrepreneurship, particularly for the next generation. The ‘pay-it-forward’ attitude found in Silicon Valley became very apparent to me through TiE’s mentorship program and the attitudes of the successful entrepreneurs that I met at TiECON.

Kanwal Rekhi and MTU students at TiECON 2016

Furthermore, every mentor that I have had not only answered my questions and guided me… but gave me “homework”. I may not have been happy about this at the time, but doing a few hours of extra research and head scratching each week has gotten me far.

Tell me and I forget, teach me and I may remember, involve me and I learn.
— Benjamin Franklin

In late April, Goldstrike Data graduated from SmartStart, a hands-on workshop run by SmartZone where ideas are grown into successful businesses. SmartStart required a lot of homework and forced us to acknowledge and answer tough questions that we did not even know existed. During this program, we discussed marketing, research, intellectual property, and financial modeling, and pitched our companies at the final pitch night marking our graduation.

SmartStart graduation with fellow tech startups (April 2016)

Goldstrike now has access to workshops, advertising opportunities, office space, peer groups, and a network of professionals in the area through SmartZone and collaboration with Michigan Tech. These resources are absolutely irreplaceable and I would not have been exposed to them without the entrepreneurial spirit that is fostered at Michigan Tech.  

Finally, my biggest supporters and mentors have been my family and friends. My father is a statistical consultant and entrepreneur who introduced me to this world at a young age for which I am extremely grateful. Without him I would not have had the confidence needed to get to this stage. I was also fortunate enough to gain the director of research computing at Michigan Tech as a mentor and close friend. He has always encouraged my hard work and ingenuity, but he also lets me dig for solutions on my own… even if it takes me a while to get there.  I am grateful for his patience, encouragement and frequent honesty.

As the founder and president of a technical startup, the best advice that I can give is to listen, question, and do your homework… even if you are not sure it is the “right answer”. 

Data Science Meets Sports Science

Sports teams can be seen as businesses that make up a larger economy such as the NBA or NFL, depending on your market. To survive in this market, does it take money to make money… or just informed decisions? Now that we have reached an era of data abundance, most sports teams are taking advantage of it. Data can be used to select next year’s draft picks without exceeding salary caps, to find the most underpaid player in the NBA, and to track players during practice for correlations to game-day performance.

One of the revolutionaries of Big Data in the sports world was Billy Beane, the general manager of the Oakland A’s, who developed the Moneyball Theory. As Michael Lewis, the author of Moneyball, asked:

How did one of the poorest teams in baseball, the Oakland Athletics, win so many games?
— Michael Lewis

The Moneyball Theory was based on the combination of two key statistics, on-base percentage and slugging percentage, which together make up a statistic called on-base plus slugging (OPS). Beane’s work was based on Bill James’ sabermetric theories, which involve the statistical analysis of baseball records (James, 1982).
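
For readers unfamiliar with the statistic, the sketch below computes OPS from its two components using the standard definitions of on-base percentage and slugging percentage. The season line is hypothetical and the function names are illustrative.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """OBP = (hits + walks + hit-by-pitch) / (at-bats + walks + hit-by-pitch + sac flies)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slugging_percentage(singles, doubles, triples, home_runs, ab):
    """SLG = total bases / at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / ab


# Hypothetical season line for one player.
ab, bb, hbp, sf = 500, 60, 5, 5
singles, doubles, triples, home_runs = 90, 30, 5, 25
hits = singles + doubles + triples + home_runs

obp = on_base_percentage(hits, bb, hbp, ab, sf)
slg = slugging_percentage(singles, doubles, triples, home_runs, ab)
ops = obp + slg
print(f"OBP {obp:.3f} + SLG {slg:.3f} = OPS {ops:.3f}")
```

A single number like this lets two players be compared directly, which is the kind of quantitative comparison that traditional scouting lacked.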

Previous scouting methods involved qualitative scout assessments of players, observing traits such as strength, full arm extension and follow through, lack of fear, aggressiveness, stride length, and speed. As a result, experience and gut feelings steered the decision making process without quantitative metrics to directly compare players.

By utilizing quantitative measures, on the other hand, Billy Beane was able to streamline the drafting process and understand the risk associated with each player he brought onto his team. Due to the Oakland A’s low budget, this assurance in the drafting process was crucial.

Professional sports teams are now investing in predictive analytics for potential draft picks as well as for maintaining the players they currently have. For example, baseball teams are using Motus Global's Pitching Sleeve to track players' motions and predict when a pitcher has a high probability of injury. As explained in the video below, team managers may collect data on pitchers throughout the season to track any changes in their form. NBA and NFL teams are using similar technology to track players’ movement during games and practice to optimize intensity and performance. This new combination of data science and sports science aims to improve the health and success of tomorrow's professional sports teams.

Drill Data Drill

The need for data science in industry 

‘Drill data drill’ has become the new mantra of the oil and gas industry. In April 2015, Cisco Consulting Services released a report on the new reality for the oil and gas industry and how changing market dynamics are driving the need for a major digital transformation. Cisco’s 2015 study consisted of a survey of 50 oil and gas industry professionals and interviews with various energy consulting and marketing firms.

Due to several key factors such as increased U.S. production and diminishing storage space for crude oil, it is projected that the price of oil may not bounce back to $100/barrel for many years, if ever. In any business, if revenue is not expected to increase, the only way to increase or maintain profit margins is to cut costs. Rather than layoffs or shutdowns of sections of production, a digital transformation is on the horizon to increase processing efficiency and to fully understand the risks associated with business management.

There are opportunities at every level of the oil and gas industry to increase operational efficiency. Based on an economic analysis performed by Cisco, utilizing operational data will improve upstream processes by reducing production, increasing rig uptime, increasing drilling efficiency, and improving remote monitoring and personnel safety. During midstream and downstream stages, fleet operations, reducing spillage, intelligent lighting, next-gen workforce, and smart refineries are all areas that may be improved through deriving data insights.

According to A CIO’s Guide to Using Gartner’s Digital Oil Framework (2014), many oil and gas companies currently struggle to improve functional and business capabilities based on real-time analysis of operating data. An offshore oil rig produces between 1 TB and 2 TB of data per day. This time-sensitive data may take up to 12 days to transfer to a central repository, and by that time any operations information that could have been extracted is no longer useful. This creates the need for “edge or fog computing,” in which operational analysis is performed on site through smart technology.
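
A rough back-of-the-envelope calculation shows why such a transfer lag pushes analysis to the edge. The sketch below assumes the 12-day figure refers to moving a single day's output over a constant link, and it uses decimal terabytes. Both are assumptions, since the source does not specify either.

```python
TB = 10**12  # bytes in a decimal terabyte

daily_output_bytes = 1 * TB   # low end of the 1 TB to 2 TB range
transfer_days = 12            # assumed time to move one day's output

# Effective link throughput implied by those two figures.
bytes_per_second = daily_output_bytes / (transfer_days * 24 * 3600)
megabits_per_second = bytes_per_second * 8 / 10**6

# Each day the rig creates a full day's data but ships only 1/12 of it.
daily_backlog_bytes = daily_output_bytes * (1 - 1 / transfer_days)

print(f"implied throughput: {megabits_per_second:.1f} Mbit/s")
print(f"backlog growth: {daily_backlog_bytes / TB:.2f} TB per day")
```

Under these assumptions the link never catches up, so any time-sensitive analysis has to run where the data is produced.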

In an interview with CNBC, Uptake's CEO Brad Keywell estimated that only 1% of data derived in oil and gas operations is being presented to and used by decision makers in industry. For longer-term analysis needs, data may be virtualized, meaning heterogeneous data types derived from all stages of an engineering process may be collected in one repository or cloud and treated as a logical database for users in any location. This process will help connect all components of a company to ensure efficiency throughout. This virtualized data may even take into account external factors such as the economy to be included in the business analytics.

Effective data capabilities also enable a more objective view of operations. Rather than relying on institutional knowledge or “gut reaction,” oil and gas firms can make better decisions, improve accuracy, and lessen risk. In addition, a common data platform can help break down language/communication barriers between different parts of the business, including IT and OT.
— Cisco Consulting Services, 2015

Cisco's survey reveals data and analysis deficiencies in only one industry. If we take a deeper look at clean energy, rail, infrastructure, environmental, and other civil engineering industries, we will hear the same story. There is a need for smart infrastructure to reduce costs and increase efficiency across the nation, resulting in direct economic improvement. This opens many doors for up-and-coming statistical and data science consulting groups who understand the importance of the engineering process and are passionate about improving our nation and resources.