Student Reasoning Patterns in NGSS Assessments

4447 Views (as of 05/2023)

Lei Liu
Senior Research Scientist
Educational Testing Service

Aoife Cahill
Managing Sr Research Scientist
Educational Testing Service

Xianyang Chen
Assistant Research Engineer
Educational Testing Service

Dante Cisterna
Assoc. Research Developer
Educational Testing Service

Devon Kinsey
Senior Research Assistant
Educational Testing Service

Yi Qi
Research Project Manager
Educational Testing Service

Student Reasoning Patterns in NGSS Assessments (SPIN-NGSS)

NSF Awards: 2000492

Presented in: 2021 (see original presentation & discussion)

Grade Level: Grades 6-8

The Next Generation Science Standards (NGSS) envision multi-dimensional student learning. To teach to these new standards, teachers need support to identify student reasoning patterns and provide appropriate feedback. Our project is a proof-of-concept study to develop and test an innovative toolkit that would automate diagnosis and feedback on student science learning. The project aims to improve understanding of student reasoning patterns associated with multi-dimensional learning of ecosystems, create automated classification of those patterns, and provide immediate feedback for teachers and students. In this proof-of-concept project, we focus our research and development activities on three key components: assessments, an automated classification tool, and an automated feedback system. With approval, we used state assessment data and data from an IES-funded project to develop, validate, and refine an automated classification tool to diagnose student reasoning patterns for making sense of ecosystem dynamics. We developed a coding scheme to identify four classes of student reasoning patterns and achieved satisfactory agreement both among human raters and between human and machine ratings. The goal of the AI modeling is to automatically predict a student's reasoning pattern and identify the associated evidence. The outcomes of this project will generate a feasible approach to automate the diagnosis of student reasoning patterns based on their written responses and provide individualized feedback to address critical gaps in student reasoning. In this video, we introduce characteristics of various student reasoning patterns and how we utilized natural language processing and machine learning techniques to automate the reasoning pattern diagnosis.

Keywords: Science, Technology, Addressing NGSS

Institution/Organization: ETS

NSF Program: EHR Core Research (ECR)

This video has had approximately 471 visits by 345 visitors from 158 unique locations. It has been played 361 times as of 05/2023.

Map reflects activity with this presentation from the 2021 STEM For All Video Showcase website, as well as the STEM For All Multiplex website.

Discussion from the 2021 STEM For All Video Showcase (13 posts)

Lei Liu

Lead Presenter
Senior Research Scientist

May 10, 2021 | 04:59 p.m.

Welcome to our SPIN-NGSS video. In this video, we introduce characteristics of various student reasoning patterns and how we utilized natural language processing and machine learning techniques to automate the reasoning pattern diagnosis. We look forward to your feedback, questions, and comments.

Pati Ruiz

Facilitator
Learning Sciences Researcher

May 11, 2021 | 09:03 a.m.

Thank you for sharing this emerging technologies/NGSS project focused on reasoning patterns. Can you please tell us more about the deep learning pre-trained transformer methods that you described using for this project, and why you decided to use this method for training your machine learning (ML) tool? Another question I have is about the classroom orchestration that you envision for when teachers use this ML tool. How will feedback be provided to students? Are teachers involved in developing that feedback? If so, how? What type of control will teachers have when introducing and using this technology in their classroom?

Thanks again and I look forward to learning more!

2

Discussion is closed. Upvoting is no longer available

Found helpful:
Dante Cisterna
Lei Liu

Lei Liu

Lead Presenter
Senior Research Scientist

May 11, 2021 | 09:35 a.m.

Thanks for visiting our video, Pati. To answer your question about feedback to students: we are collaborating with teachers to design automated feedback based on characteristics of student reasoning patterns. We are planning a small-scale cog lab study to try out the automated diagnosis and feedback with a small group of students. The feedback will highlight both achievements and gaps in specific reasoning patterns. With the support of automated diagnosis and feedback, teachers will play multiple essential roles in the classroom orchestration. To list a few: based on identified patterns, teachers can group students and assign differentiated tasks to address gaps in their reasoning; they can probe further how students' backgrounds and life experiences relate to their habits of reasoning; and they can adjust instruction accordingly to support multi-dimensional reasoning. I will let the NLP expert on our team answer your question about the ML method.

2

Found helpful:
Pati Ruiz
Dante Cisterna

Aoife Cahill

Co-Presenter
Managing Sr Research Scientist

May 11, 2021 | 11:03 a.m.

To respond to your question about the ML: we fine-tune a BERT model on our small dataset to make predictions about reasoning patterns. BERT is a transformer-based (https://arxiv.org/pdf/1810.04805.pdf) language model pretrained on a huge amount of text (https://arxiv.org/pdf/1810.04805.pdf) and released by Google. We take this pre-trained language model, swap the output layer for a linear layer that predicts reasoning patterns, and fine-tune the resulting model on our dataset. We chose this method because it has been shown to give good results across many kinds of tasks and can be applied to small datasets (compared with other deep learning models that don't involve fine-tuning). In preliminary experiments we also compared this to an SVR using n-gram features, and the fine-tuned BERT model performed consistently better.
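For readers unfamiliar with the "swap the output layer for a linear layer" step, the following is a minimal, illustrative sketch in plain Python, not the project's actual code. It shows only the linear classification head: a pooled encoder vector is mapped to four logits, turned into probabilities with softmax, and the head is updated with one gradient step on the cross-entropy loss. The label names, the toy hidden size, the random stand-in for the encoder output, and every function name are assumptions made for illustration; a real fine-tuning run would also update the pre-trained encoder weights.

```python
import math
import random

# Hypothetical label names: the source only says there are four classes.
PATTERNS = ["pattern_A", "pattern_B", "pattern_C", "pattern_D"]
HIDDEN_SIZE = 8  # toy size chosen for illustration only


def make_linear_head(hidden_size, num_classes, seed=0):
    """Create small random weights and zero biases for the output layer."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
               for _ in range(num_classes)]
    bias = [0.0] * num_classes
    return weights, bias


def linear_head(pooled, weights, bias):
    """Map one pooled encoder vector to one logit per reasoning pattern."""
    return [sum(w * x for w, x in zip(row, pooled)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Loss for a single example with a known gold label."""
    return -math.log(probs[target_index])


def sgd_step(pooled, target_index, weights, bias, lr=0.1):
    """One in-place gradient step on the head (the encoder is not modeled)."""
    probs = softmax(linear_head(pooled, weights, bias))
    for k in range(len(weights)):
        grad_logit = probs[k] - (1.0 if k == target_index else 0.0)
        for j in range(len(pooled)):
            weights[k][j] -= lr * grad_logit * pooled[j]
        bias[k] -= lr * grad_logit


# Stand-in for the encoder's pooled output for one student response.
rng = random.Random(1)
pooled_vector = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]
weights, bias = make_linear_head(HIDDEN_SIZE, len(PATTERNS))

gold_index = 2  # hypothetical human-assigned label for this response
loss_before = cross_entropy(softmax(linear_head(pooled_vector, weights, bias)), gold_index)
sgd_step(pooled_vector, gold_index, weights, bias)
probs = softmax(linear_head(pooled_vector, weights, bias))
loss_after = cross_entropy(probs, gold_index)

predicted_label = PATTERNS[max(range(len(probs)), key=probs.__getitem__)]
print(predicted_label, round(loss_before, 4), round(loss_after, 4))
```

In this sketch the loss on the single example goes down after the update, which is the basic mechanism that fine-tuning repeats over many labeled responses.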

2

Found helpful:
Lei Liu
Dante Cisterna

Marcia Linn

Higher Ed Faculty

May 11, 2021 | 05:21 p.m.

Hi Lei! It's great to learn about the exciting work you are doing. Your video is thought provoking. We wonder if you are thinking about ways to communicate the information you are developing to teachers while they are teaching? See our work exploring the ways teachers use data while teaching for social justice in science. Enjoy, Marcia

1

Found helpful:
Dante Cisterna

Lei Liu

Lead Presenter
Senior Research Scientist

May 11, 2021 | 06:35 p.m.

Thanks for watching our video, Marcia. You are spot on that it's important to communicate and collaborate with teachers to support their practice in classrooms, which is our next step. Your video is quite inspiring for us as we consider our next-step design. We would love to have a follow-up discussion. - Lei

Dalila Dragnic-Cindric

Facilitator
Postdoctoral Researcher

May 12, 2021 | 02:31 p.m.

Thank you for sharing your exciting work! You said in your video that the current accuracy is at 80%. I’d like to know how you defined accuracy in this project. Could you tell us a bit about the remaining 20%, the types of patterns not accurately detected, and the steps you are taking to increase accuracy?

Thanks in advance.

1

Found helpful:
Dante Cisterna

Lei Liu

Lead Presenter
Senior Research Scientist

May 12, 2021 | 04:45 p.m.

Thanks for checking out our video, Dalila. As Aoife mentioned below, we used macro f-score to calculate the human-machine agreement.
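
As a concrete illustration of the metric, here is a minimal sketch of macro F-score in plain Python: an F-score (the harmonic mean of precision and recall) is computed for each class, and the per-class scores are averaged with equal weight. The function name, the label names, and the toy human and machine ratings are made up for illustration and are not project data.

```python
def macro_f1(gold, predicted):
    """Average the per-class F1 scores, weighting every class equally."""
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0  # class never predicted correctly
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)


# Toy ratings with hypothetical class names A-D.
human = ["A", "A", "B", "B", "C", "D"]
machine = ["A", "B", "B", "B", "C", "C"]
score = macro_f1(human, machine)
print(round(score, 4))  # 0.5333
```

Because every class counts equally, a rare reasoning pattern that the model never gets right (class D above) pulls the score down as much as a common one would.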

The human-machine agreement and human-human agreement fall in a similar range. In my view, the remaining 20% are not necessarily inaccurately detected. They could be responses that need additional judgment (e.g., more contextual information is needed, or other factors may lead to this type of reasoning), which teachers may be able to factor into their own judgment. For all student responses, in addition to identifying the reasoning pattern, we also annotated text as evidence for a given pattern (e.g., orange highlights for DCI; blue highlights for SEP). This way, teachers can also make their own judgments when looking at the machine-generated pattern label. We are also trying different models to improve accuracy and hope to report improved results later.

1

Found helpful:
Dalila Dragnic-Cindric

Aoife Cahill

Co-Presenter
Managing Sr Research Scientist

May 12, 2021 | 03:36 p.m.

Thanks for your question about the evaluation. Our evaluation is currently in terms of macro F-score: the harmonic mean of precision and recall, computed for each class and then averaged across classes.

Suzanne Otto

Facilitator
Teacher / Fellow

May 12, 2021 | 05:45 p.m.

Thank you for sharing your research. I'm a HS science teacher and know that there is still much to be done to help teachers fully integrate the mindsets in the NGSS. Practical tools to help teachers are definitely in high need.

I'm wondering how you see this tool in use in a live classroom. Will students answer questions and then get immediate feedback for their own learning cycle? What reports will the teacher get to enable them to see both the big picture and still differentiate for individual student differences? Are you looking at ways to provide data patterns for each of the 3-dimensions in the NGSS?

1

Found helpful:
Dante Cisterna

Dante Cisterna

Co-Presenter
Assoc. Research Developer

May 12, 2021 | 06:06 p.m.

Hello Suzanne! Thanks for watching the video and for your comments. Our goal is developing this model in a way that is useful for teachers and students. At this stage, we are developing the automated model to identify students' reasoning patterns and planning to receive teacher feedback.

Certainly, we have been discussing how to use the automated tool in the classroom. One idea that we have considered is to develop prompts for teachers with the goal of guiding student reasoning to more complex responses (based on the tridimensional framework) and based on certain reasoning patterns. Potentially, teachers can identify groups of students whose responses fit a particular reasoning pattern and provide feedback based on those prompts.

Kelly Billings

Researcher

May 13, 2021 | 08:47 p.m.

Hello Team, thank you for a wonderful video. I am wondering what thoughts or ideas you all have about how this will look for teachers and students. What does the front-end structure look like? What suggestions or data will be presented to teachers, and how do you anticipate this information might be used?

Thank you! Kelly

Lei Liu

Lead Presenter
Senior Research Scientist

May 14, 2021 | 09:19 a.m.

Hi Kelly, thanks for your question. The current project is a proof-of-concept study focusing on the design of the automated tools. The scope of a larger project is to embed the automated tools into classroom assessments. The front end will be the classroom assessments, where students can input their responses. Once students' inputs are in, automated prompts will be generated by the automated tools linked at the back end. The assessments will be delivered through a platform that includes a teacher dashboard, where teachers can track student progress and automated diagnosis results, as well as what type of feedback was generated for individual students. The diagnosis results will include highlighted text associated with NGSS dimensions as evidence for specific reasoning patterns. We anticipate that teachers can use this information to adjust their instruction to address individual students' needs, group students for peer discussion, and probe for additional evidence if the highlighted evidence for the reasoning pattern diagnosis is not convincing. We are planning to work with teachers to design multiple approaches to leverage the automatically diagnosed information. Hope this helps answer your question, and let us know if you have suggestions from your work as well. Thanks!

Further posting is closed as the event has ended.