Why and How Libraries Should Use Student Data: Two Perspectives

Why (and How) libraries should use student data to measure the relationship between library services, spaces, etc., and student outcomes.

In this conversation, two librarians at FSU share their perspectives and experiences on student data. Why should we use student data? Why shouldn’t we use student data? Below, read Adam Beauchamp and Kirsten Kinsley’s take on student data.


By Adam Beauchamp, Humanities Librarian

“How” should libraries use student data to measure the relationship between library services, spaces, etc., and student outcomes?

In short, very carefully. 

There are two questions, one methodological and one ethical, that I ask myself when considering the use of student data in library assessment. First, do these data actually measure the outcome I want to assess? Second, does the potential benefit to students from using data this way offset the potential harm to students’ privacy or intellectual freedom? 

Libraries collect a lot of transactional data in the course of normal operations, and it is tempting to use these data to demonstrate the library’s value. But the mistake I see often in the library literature is that librarians conflate a student’s simple use of the library with the broadest (and most flawed) measure of student learning: GPA. A high GPA may be indicative of a student who does well in general on tests and class assignments, but GPA doesn’t tell you what or how a student has learned. 

Similarly, counts of library transactions like checking out a book, attending a workshop, clicking on an electronic journal article, or passing through library turnstiles do not tell you how, or even if, those transactions resulted in student learning. In the first example, did our student read the book, fully understand its thesis, and successfully incorporate those ideas into her own thinking and writing? Did she even select an appropriate book for the assignment in the first place? None of these questions are answered by circulation statistics. Asked another way, if learning did not take place in this scenario, is it the fault of the library? If the library cannot be held liable for the student who checks out a book and doesn’t learn, can the library take credit for the same transaction leading to a positive outcome without justifying how?

If library transactional data alone are insufficient measures of student learning, one response is to collect more, better student data. How can we know if the student above learned from checking out a book? Perhaps if we had her entire search history and records of every other book and journal article she had looked at, we could then hypothesize about whether or not she had deployed a logical search strategy and had selected the “right” book for the assignment. But now we have a serious ethical problem. Is it appropriate for libraries to conduct this kind of surveillance of our users? If the student finds out we are scrutinizing her every keyword and mouse click, would she think twice about searching for or reading materials on certain subjects? Are we sharing these data with her professor, who might deduct points for “irrelevant” searches or checking out the “wrong” books? Have we gotten any closer to discovering how the student is or isn’t learning such that we could alter our practices to benefit her and other students?  

In the ALA Code of Ethics we are called to offer the highest quality service, but we cannot let pursuit of that value cause us to trample on our equally important mission to support intellectual freedom, fight censorship (including self-censorship), and protect library users’ right to privacy and confidentiality. Therefore, we must think very carefully before using student data in an effort to connect library use in all its forms to student learning outcomes. We must be sure that such use of student data is a valid measure of the outcome we wish to assess, and that the potential benefits to students outweigh the potential harms to students’ privacy and intellectual freedom, values at the center of our professional ethics.


By Kirsten Kinsley, Assessment Librarian

I offer my position for the purpose of modeling Critical Thinking and Healthy Discussion, the value of the month, and to start the conversation on big data and student privacy:

An example of the data we collect on students:

We collect data on a number of library services, outreach, spaces, and resources/collections, as demonstrated by the data inventory completed in the fall. One of the datasets assessment collects is card swipe data for Dirac and Strozier. From these data we can determine who uses the library, how frequently, and for how long. The data include student EMPLIDs, nine-digit identifiers that connect individual students’ card swipes with the Office of Institutional Research Business Intelligence (BI) data warehouse. The BI system is used to pull any number of variables about each student who enters the library. Some of the student variables include: major, year of study or status, semester or overall GPA, retention, class load, and whether they are an athlete, a part of Greek Life, or a veteran. We also pull demographic variables such as age, sex, and race, and pre-college variables such as high school GPA, SAT/ACT scores, and Pell grant status.
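
As a rough illustration of the linkage described above, here is a minimal Python sketch. It assumes each swipe record carries an entry and an exit time, which may not match how the card readers actually log visits, and it uses a small dictionary as a stand-in for the BI warehouse. Every field name and value below is invented.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical swipe records: (EMPLID, building, entry time, exit time).
# The field layout and all values are invented for illustration.
SWIPES = [
    ("000000001", "Strozier", "2019-09-03 10:00", "2019-09-03 12:30"),
    ("000000001", "Dirac", "2019-09-05 14:00", "2019-09-05 15:00"),
    ("000000002", "Strozier", "2019-09-04 09:15", "2019-09-04 09:45"),
]

# Hypothetical stand-in for attributes pulled from the BI warehouse.
STUDENT_ATTRIBUTES = {
    "000000001": {"major": "History", "year": 1},
    "000000002": {"major": "Physics", "year": 2},
}


def summarize_visits(swipes):
    """Return {emplid: {"visits": count, "minutes": total}} from swipe records."""
    fmt = "%Y-%m-%d %H:%M"
    summary = defaultdict(lambda: {"visits": 0, "minutes": 0.0})
    for emplid, _building, entered, exited in swipes:
        duration = datetime.strptime(exited, fmt) - datetime.strptime(entered, fmt)
        summary[emplid]["visits"] += 1
        summary[emplid]["minutes"] += duration.total_seconds() / 60
    return dict(summary)


def join_attributes(summary, attributes):
    """Attach warehouse attributes to each student's visit summary by EMPLID."""
    return {
        emplid: {**stats, **attributes.get(emplid, {})}
        for emplid, stats in summary.items()
    }


usage = join_attributes(summarize_visits(SWIPES), STUDENT_ATTRIBUTES)
```

In this sketch the EMPLID is used only as the join key; it would be replaced or dropped before any analysis or reporting.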

We include all these variables about individual students and examine them in the aggregate because theories and research in higher education conclude that a lot goes into what makes a student successful in college and there are many ways to measure that.  Some point to pre-college variables, such as SAT/ACT scores, as factors that contribute to success. Other studies have shown that collegiate engagement in programs like CARE, living-learning communities, athletics, or studying abroad plays a role in student success. Using statistics, we can also tease out the relationship between library usage and student success.

To be thorough, our analysis includes as many factors as we can, holding some variables constant (such as high school GPA, which can have a hidden effect on student library usage) while measuring whether something like first-year retention rates may be affected by library usage.  We do this knowing that we cannot account for all variables, such as personal motivation.
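
One simple way to picture "holding a variable constant" is stratification: compare library users and non-users only within the same high school GPA band. The Python sketch below is an illustration under that assumption, not a description of the models we actually run (a regression would be the more usual tool). The bands and records are invented.

```python
from collections import defaultdict

# Hypothetical, de-identified records:
# (high school GPA band, library user?, retained into second year?)
RECORDS = [
    ("3.0-3.4", True, True), ("3.0-3.4", True, True),
    ("3.0-3.4", False, True), ("3.0-3.4", False, False),
    ("3.5-4.0", True, True), ("3.5-4.0", True, False),
    ("3.5-4.0", False, True), ("3.5-4.0", False, False),
]


def retention_by_stratum(records):
    """Retention rate of users and non-users within each GPA band."""
    # For each band, track [retained count, total count] for users and non-users.
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for band, is_user, retained in records:
        cell = counts[band][is_user]
        cell[0] += int(retained)
        cell[1] += 1
    result = {}
    for band, cells in counts.items():
        # A band is only comparable if it contains both users and non-users.
        if cells[True][1] and cells[False][1]:
            result[band] = {
                "users": cells[True][0] / cells[True][1],
                "non_users": cells[False][0] / cells[False][1],
            }
    return result


rates = retention_by_stratum(RECORDS)
```

Because the comparison happens inside each band, a difference between the two rates cannot be explained by high school GPA alone, although unmeasured factors such as motivation remain.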

You might ask, “How do we know that it is not something about a student’s good study habits, rather than using the library space itself, that makes them successful?”  In other words, good students self-select to go to the library. This is self-selection bias, and we apply statistical techniques like precision matching to mitigate it. One way to do that is to create a comparison group by matching each library user’s demographics and characteristics with those of a non-library user. For example, a student who visits the library’s physical space is matched with a student who does not use the library on characteristics like ethnicity, age, year of study, major, high school GPA, SAT score, etc.  The matched students then differ mainly in how often they visit the library, which lets us estimate the effect of those visits on first-year retention rates. The result is a group of library users and a matched comparison group of non-library users. We can apply this technique to measure whether the library user group has a statistically significantly higher semester-to-semester retention rate.
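
The matching idea can be sketched in a few lines of Python. This is a simplified illustration that assumes one-to-one exact matching on three characteristics; the records, field names, and choice of matching keys are all invented, and a real analysis would match on more characteristics and follow up with a significance test.

```python
from collections import defaultdict

# Hypothetical, de-identified records; every field name and value is invented.
STUDENTS = [
    {"id": "a", "user": True, "year": 1, "major": "BIO", "hs_gpa_band": "3.5-4.0", "retained": True},
    {"id": "b", "user": False, "year": 1, "major": "BIO", "hs_gpa_band": "3.5-4.0", "retained": False},
    {"id": "c", "user": True, "year": 1, "major": "HIS", "hs_gpa_band": "3.0-3.4", "retained": True},
    {"id": "d", "user": False, "year": 1, "major": "HIS", "hs_gpa_band": "3.0-3.4", "retained": True},
    {"id": "e", "user": True, "year": 2, "major": "PHY", "hs_gpa_band": "3.5-4.0", "retained": True},
    {"id": "f", "user": False, "year": 1, "major": "BIO", "hs_gpa_band": "2.5-2.9", "retained": False},
]

# Characteristics that must be identical for two students to be paired.
MATCH_KEYS = ("year", "major", "hs_gpa_band")


def exact_match(students, keys=MATCH_KEYS):
    """Pair each library user with one unused non-user who shares every key."""
    pool = defaultdict(list)
    for student in students:
        if not student["user"]:
            pool[tuple(student[k] for k in keys)].append(student)
    pairs = []
    for student in students:
        if student["user"]:
            candidates = pool[tuple(student[k] for k in keys)]
            if candidates:  # users without a counterpart are left out
                pairs.append((student, candidates.pop()))
    return pairs


def retention_rates(pairs):
    """Return (user rate, non-user rate) across the matched pairs."""
    if not pairs:
        raise ValueError("no matched pairs to compare")
    n = len(pairs)
    users = sum(user["retained"] for user, _ in pairs) / n
    non_users = sum(other["retained"] for _, other in pairs) / n
    return users, non_users


pairs = exact_match(STUDENTS)
user_rate, non_user_rate = retention_rates(pairs)
```

Students "e" and "f" have no counterpart and drop out of the comparison, which is the usual price of exact matching: the stricter the match, the smaller the matched sample.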

All the student data that we collect are anonymized and analyzed in the aggregate.  We do not want to know about a particular student’s library participation, but we do want to know what trends there are that relate to library usage in general.  Aside from library spaces, we know that the library provides many important services, configurations of spaces, and resources that help our users. So measuring the impact of as many library variables together as we can will build our case that the library makes a difference. Since this process involves a lot of datasets, we need a secure and safe data warehouse to store them.

Why should we use all that student data to do this? To show that libraries play a role in student success.

To compete for campus funding:

If we don’t hold ourselves accountable and demonstrate our value and impact on student, faculty, and staff success, who can we count on to advocate for us on campus? Other campus divisions and stakeholders, such as the Academic Center for Excellence (ACE) and the Division of Student Affairs, are and will be vying for dollars by showing evidence of their contributions. [Look at Goals IV & V of the Strategic Plan: Student Success. The focus of those Goals includes student advising initiatives.] FSU’s operating budget includes E&G Funds (44.36%), and the Libraries share a portion of the E&G budget with other campus stakeholders. Of that 44.36%, 11.80% of the operating budget comes from tuition and fees, and the remaining 32.56% comes from state support.

Statewide:

The Board of Governors (BOG) oversees the distribution of some large sources of funding: Preeminence funding, Performance-based funding, and National Ranking Enhancements that are distributed to the eleven schools in the State University System (SUS). At FSU, “performance funding currently accounts for approximately 22% of all Education & General (E&G) dollars, the principal source of university operating funds” (Daly PowerPoint 2019) and is based on metrics. Metrics that libraries can contribute to include: four-year graduation rate, academic progress rate (2nd year retention with GPA above 2.0), and others. Another source of income that is getting more and more competitive is Preeminence funding. Of the $20 million in recurring funds for the SUS, FSU’s portion is $6.1 million.

Nationally:

ACRL’s Value of Academic Libraries Initiative (2010) was strongly supported by IMLS grant funding for projects and programming, and it continues to be a driving force behind research that demonstrates Value and Impact. Currently, ACRL offers grants of up to $3,000 to libraries that want to conduct research to demonstrate value.

While funding is not the only reason we should measure our impact on student success, it is clearly a compelling reason.

So, how do we measure the Libraries’ impact on student success, while also honoring our values?

  1. Foster partnerships: By partnering with other campus colleges and departments as we have successfully done in the past and by utilizing high standards of research practice and methodologies, we raise awareness across campus that the library makes a difference in the lives of students, faculty, and staff.  
  2. Adhere to research standards: We ask questions and let the data speak, not the other way around as in data trolling or dredging. For quantitative research we frame our questions with theories tested in higher education [for example, for students we could use Astin’s Theory of Student Involvement (1984) or Tinto’s Institutional Departure Model (1975, 1993)]. We don’t give the data to third parties. We do it to show them the value of students’ tuition or tax dollars. We can do all this while still honoring our professional values (e.g., ALA Code of Ethics).
  3. Model good data stewardship: We can be a role model not only in how we adhere to good research practice, but also in being transparent with library users that we collect their information, in the aggregate, to improve their services, spaces, and collections. We adhere to stringent privacy considerations and make sure we are in alignment with campus data governance. We are good data stewards by maintaining high standards of data management practices and protocols, such as how we store, secure, and de-identify data. We do this using the same research standards and protocols as the university.  We develop a privacy statement and a way for users to opt out of library research should they want to.
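
To make the stewardship practices above a little more concrete, here is a minimal Python sketch of three common steps: honoring opt-outs, replacing EMPLIDs with keyed hashes, and reporting only aggregate counts above a minimum group size. It is an assumption-laden illustration, not our actual protocol. The function names, the secret key handling, and the threshold of five are all hypothetical, and keyed hashing is pseudonymization rather than full anonymization.

```python
import hashlib
import hmac
from collections import Counter


def drop_opted_out(records, opted_out_emplids):
    """Remove every record belonging to a student who opted out of research."""
    return [r for r in records if r["emplid"] not in opted_out_emplids]


def pseudonymize(emplid, secret_key):
    """Replace an EMPLID with a keyed hash so records can be linked but not read."""
    return hmac.new(secret_key, emplid.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_with_suppression(groups, minimum_cell_size=5):
    """Count students per group, withholding (None) any group below the threshold."""
    counts = Counter(groups)
    return {
        group: (count if count >= minimum_cell_size else None)
        for group, count in counts.items()
    }


# Hypothetical records and key; a real key would live in a secured store.
RECORDS = [{"emplid": f"00000000{i}", "major": "History"} for i in range(1, 7)]
RECORDS += [{"emplid": "000000007", "major": "Physics"},
            {"emplid": "000000008", "major": "Physics"}]
SECRET_KEY = b"example-key-not-for-real-use"

kept = drop_opted_out(RECORDS, opted_out_emplids={"000000006"})
deidentified = [
    {"student": pseudonymize(r["emplid"], SECRET_KEY), "major": r["major"]}
    for r in kept
]
report = aggregate_with_suppression(r["major"] for r in deidentified)
```

Suppressing small groups matters because a count of one or two students in a narrow category can identify a person even when no name or ID appears in the report.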

Summary

We need to be proactive about demonstrating our impact and value to the institution and about advocating for that value to stakeholders.  Aside from competition for campus funds, we need to hold ourselves accountable by measuring whether what we do matters. If any part of this institution is capable of measuring impact and value with care and consideration of its users, it is the Libraries. Our values and concerns will keep us balanced between contributing evidence-based data and protecting privacy, and will keep what we measure within the bounds of reason. Let’s not leave it up to the vendors to decide the granularity of data we seek; let us ask the questions, within the bounds of our values, and conduct sound research in good faith, knowing that there will never be data that give us certainty and definitive answers, only a compass to point the way.

Note: References for my assertions provided upon request.
