New Statistical Software Consultant

Patricia Posey, our new statistical software consultant, begins this coming week, offering appointments to provide assistance with statistical software, including R, Stata, and SPSS. She welcomes questions about proper commands, data visualization, and regression analysis. She cautions that her assistance is not intended to help students decide on the suitability of a given statistical method for their research, pick which datasets to use, or interpret results that should be based on the researcher’s ideas and discipline-specific expectations.

Patricia is a fourth-year doctoral candidate in Political Science, where she specializes in American Politics. Her research investigates how financial services influence the political engagement and political attitudes of racial and ethnic minorities. Her statistical experience covers analysis of a variety of social science data sets. She obtained Bachelor of Arts degrees in Political Science and Sociology, with a minor in Latin American Studies, from the University of Florida in 2013.

Patricia will be offering statistical software help by appointment on Monday and Tuesday afternoons at WIC in room 116, starting on January 30th. Use the online scheduler to view her availability and request an appointment.

Information Literacy Workshops

The phenomenon of fake news has, ironically, become a hot topic for major news outlets in recent months. News stories are being presented as fact without any substantial backing in truth. There are many reasons why fake news is created and promulgated. They range from personal monetary gain to the accidental, well-intentioned spread of misinformation.

With so many reasons tempting so many people to promulgate fake news, how do you know what sources to trust? How do you know the supposed rise of fake news isn’t merely a fake news story itself, anyway? Penn Libraries can help with that.

During the month of January, Penn Libraries will be offering a three-part Information Literacy Workshop series about evaluating news sources. Each workshop will highlight a different kind of misinformation while preparing participants to recognize and mediate false information in their own news consumption.

A workshop entitled Fake News: Pinpointing Lies, Hoaxes, and Conspiracy Theories will kick off the series on Wednesday, January 18, 2017, from 3-4:30pm in the Weigle Information Commons Seminar Room. This installment focuses on evaluating false information.

The next two workshops feature strategies for identifying Slippery News and Shoddy News – distinctions that have recently become necessary. In brief, slippery news refers to stories that aren’t meant to maliciously deceive but are hotbeds for misinformation. The shoddy news workshop, on the other hand, will link news reports of research to the research itself in an attempt to decipher which stories are sourced with verifiable research and which utilize papers with unsound methodologies.

Attending any one of these workshops can help you sift through the massive amounts of ambiguous information available on the internet every day. Attending the workshops as a series will give you nuanced insight into the different types of unreliable information out there and provide you with tools to think critically and avoid consuming that misinformation.

Penn and the Surrounding Community

On the edges of the Van Pelt Collaborative Classroom, located just down the hall from the Weigle Information Commons, an exhibit about the edges of Penn’s presence in West Philadelphia runs until Friday, February 24, 2017. Penn and the Surrounding Community is a collection of work by Dr. Rosemary Frasso’s students from the SW781/PUBH604 class entitled Qualitative Research in Social Work and Public Health. This semester’s exhibit focuses on how undergraduate and graduate students here at Penn conceptualize the University’s impact on its urban setting.

Nominal Group Technique (NGT), in which members of a group name and then rank items, was used to determine the topic of exploration for the class research study. Briefly, Dr. Frasso moderated a session in which the students suggested potential topic ideas, then ranked those ideas. The topic of Penn and the Surrounding Community was collectively chosen as the central theme for investigation.

First, the students collected free-listing data. Each of the 25 students in the class recruited 5 participants (total of 125 people) from the Penn community and asked them to share the words that come to mind when they think about Penn’s relationship with the surrounding community. These data (words generated) were then analyzed to determine the salient domains.
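The post does not say how the free-listing data were scored. As an illustration only, one commonly used measure is Smith's salience index, which rewards words that are mentioned by many participants and mentioned early in their lists. The sketch below assumes that measure; the function name and the sample responses are hypothetical and are not taken from the class's data.

```python
from collections import defaultdict

def smith_salience(free_lists):
    """Return {word: salience}, averaged over every participant's list.

    Within one list of length n, the word at position r (1-based)
    scores (n - r + 1) / n. Words a participant did not mention score 0.
    """
    totals = defaultdict(float)
    for words in free_lists:
        length = len(words)
        for rank, word in enumerate(words, start=1):
            totals[word.lower()] += (length - rank + 1) / length
    participants = len(free_lists)
    return {word: total / participants for word, total in totals.items()}

# Hypothetical responses, invented for illustration only.
example_lists = [
    ["safety", "bubble", "jobs"],
    ["bubble", "safety"],
    ["gentrification", "bubble"],
]

salience = smith_salience(example_lists)
for word, score in sorted(salience.items(), key=lambda pair: -pair[1]):
    print(f"{word}: {score:.2f}")
```

Words with the highest scores would be candidates for the salient domains; where to draw the cutoff is a judgment call for the researcher.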

Then each student recruited one additional participant to take part in the photo-elicitation arm of the study. Briefly, each participant was asked to think about Penn’s relationship with the surrounding community and to use their camera or smartphone to take photos that would help them explain their impression of this relationship. The photos were then used to guide a qualitative interview. All interviews were recorded, transcribed verbatim, and analyzed in the Collaborative Classroom.

The preliminary analysis yielded 10 thematic categories: Benefits, Safety, Permeability, Double-Edged Sword, Accessibility, Responsibility, Exclusivity, Bubble, Boundary, and Penntrification. Within these broad categorizations, representative photos and their accompanying captions were chosen for exhibition. The finished product will ultimately include an abstract for presentation as well as a manuscript for publication in addition to these preliminary findings currently on exhibit. The project can be viewed on the Scholarly Commons’ New Media Showcase.

The photos and quotes paint a complicated picture of how students perceive Penn’s relationship with the West Philadelphia community. The work highlights both the benefits and the drawbacks that are byproducts of Penn’s presence in West Philly, best described as a “double-edged sword.” With thought-provoking insights like these, the exhibit is an enlightening and self-reflective project that is well worth the visit. Research rigor and critical social immersion blend to demonstrate the strengths of research in Public Health and Social Work.

 

A Chilly Winter Break at WIC

With finals upon us and winter break quickly approaching, we wanted to inform our patrons that we’ll be having some work done at WIC over winter break. From December 27th through 30th, the library duct systems will be cleaned, which means that WIC will be chilly, noisy, and possibly overtaken by workers in certain booths and rooms. Even though WIC will be open (but not staffed) those days from 8:30am to 5pm, we’ve taken all of the bookings for the booths, group study rooms, and the WIC Seminar Room offline in case our workers need them. Anyone using WIC is more than welcome to sit in a booth or room, but please know that if the workers need access to any areas, they may ask you to move.

Please feel free to get in touch with us at wic1@pobox.upenn.edu with any questions. We wish everyone a happy holiday and restful winter break!

Getting your audio/video transcribed

This post is adapted from an email I wrote in response to a question about the best way of obtaining a transcription of an audio file.

Good transcriptions/captions are incredibly useful in a variety of situations, and due to ADA compliance, they’re increasingly a necessity. People usually don’t think about this ahead of time, and I try to encourage people to build captioning into research budgets and grant applications whenever possible because costs add up. The more footage you have, the more likely you’re going to have to get someone else to do it, and even just 10 hours of audio could cost you $1000 to have transcribed by a captioning service.
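For budgeting, the figure above works out to a per-minute rate that can be scaled to a project. This is a minimal sketch of that arithmetic; the 25-hour project size is a hypothetical example, and real quotes will vary by service and turnaround time.

```python
def transcription_cost(audio_hours, rate_per_minute):
    """Estimated cost in dollars for a given amount of recorded audio."""
    return audio_hours * 60 * rate_per_minute

# Rate implied by the figure above: $1000 for 10 hours of audio.
implied_rate = 1000 / (10 * 60)  # roughly $1.67 per minute of audio

# Hypothetical project size, for illustration only.
estimate = transcription_cost(audio_hours=25, rate_per_minute=implied_rate)
print(f"${implied_rate:.2f} per audio minute; 25 hours is about ${estimate:,.0f}")
```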

Some of you may be tempted to rely on YouTube’s automatic captions. By way of example, here’s a video we put up where all of the speakers speak quite clearly:

https://www.youtube.com/watch?v=J93E5s0yHxM

But as of late 2016, the quality of YouTube’s automatic captions, although clearly much improved over the years, still means that they serve little purpose beyond their comedic/entertainment value. They’re good enough only to get a very general idea of what’s going on, and that’s about it. And this is with clean audio and clear speakers with a standard American English accent.

  • It’s not accurate enough for ADA compliant captions or for hearing impaired people to find useful.
  • It’s not accurate enough for a native English speaker to watch the video with the sound off.
  • It’s not accurate enough for non-native English speakers to use to increase comprehension or to use with automatic translation services.
  • It’s not accurate enough for a production transcript for an editor to find clips to use.
  • It’s not accurate enough to provide useful search capability.
  • It’s not accurate enough as an alternate way of archiving audio content.
  • It’s not accurate enough to use the transcriptions in a thesis, dissertation, or journal article.
  • It’s not accurate enough to do a qualitative analysis of the text.
  • It MIGHT be accurate enough for some degree of SEO, but it’s certainly not ideal.
  • It’s inaccurate enough that if you’re going to take these captions as a starting point and then go back and edit them, you’re not really saving yourself much time.
  • Inaccurate captions can also detract from the user experience because users end up focusing on the errors instead of on your content.
  • It’s inaccurate enough that it makes it difficult to impossible to repurpose the text to other contexts (blog posts, tweets, emails, etc.).
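To put a number on "accurate enough," one option is to hand-correct a short stretch of the captions and compare the two versions word by word. The sketch below assumes one common definition, word accuracy as one minus the word error rate (word-level edit distance divided by the length of the corrected text). The post does not say how accuracy should be measured, and the sample sentences are invented.

```python
def word_accuracy(reference, hypothesis):
    """Return 1 - (word-level edit distance / number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference text must contain at least one word")
    # Edit distance over words: substitutions, insertions, and deletions.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return 1 - previous[-1] / len(ref)

# Invented example: a hand-corrected line and a made-up automatic caption.
corrected = "the library will be open during winter break"
automatic = "the library will be opened during winter brake"
accuracy = word_accuracy(corrected, automatic)
print(f"word accuracy: {accuracy:.0%}")
```

This simple version ignores punctuation and speaker labels, so treat the result as a rough check rather than a formal measurement.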

The best transcription software out there still works best when it’s had a chance to learn a particular speaker’s voice, which takes time and means you have to correct the software as you go so it can learn from its mistakes. This is fine when the same person is transcribing their own voice over and over again, but it’s not so useful for just a handful of interviews of each speaker.

I say all of this not to put down YouTube (again, I’m actually really impressed it’s as accurate as it is) but in support of the idea of paying human beings to transcribe it for you—preferably people who are experienced in doing so, but almost any person is going to do a better job than software.

Whether you’re going to hire a service or pay an undergrad to type something up for you, there are some things to consider, all of which can help determine which route you take:

  1. The fairest way to compare services is to be sure you’re paying per minute of interview, not per minute of time spent transcribing, which will vary from person to person.
  2. Are volume discounts available?
  3. Are educational discounts available?
  4. Try to find a service which guarantees a certain level of accuracy (generally, it’s not going to be usable for most purposes if it’s less than about 97% accurate). Is the provided quality/level of accuracy good enough for your needs? Is it good enough to attach your name and Penn’s name to the final product?
  5. Do you need just a transcript? Or timed captions?
  6. Do you want an “interactive transcript” like what com does with their instructional videos?
  7. Find out what output formats they provide. (Is it just straight text in a .docx file with a periodic time code stuck in? Timed captions in SRT? DFXP/TTML?) The degree of accuracy you need for the timing of the text will partly determine what file format you need. Some are convertible to others.
  8. Some services will transcribe a few videos for free first to see if you’re happy with the service.
  9. How fast is the turnaround time they offer? (Generally you pay less for slower turnaround, but it can be useful to be able to pay extra when you need it the next day.) A service is going to provide much faster turnaround time than an individual can because they have many transcribers working for them.
  10. Does your school have an existing relationship with a captioning service?
  11. Do your captions need to be ADA compliant? (Both Penn State and Netflix have had lawsuits against them because of the lack of captioning. Check with your School/center/department to see if there’s a policy regarding captioning you’ll need to follow.)
  12. Do you need a HIPAA compliant service or is the material otherwise sensitive or confidential?
  13. Can you build the cost of transcribing into your research budget or grant proposal?
  14. Do you need all of your raw footage transcribed (as you would if you were editing a documentary)? Or just the final edited version (as you would if you were simply trying to meet ADA requirements)?
  15. Are they a Penn-approved vendor? Can you pay with a purchase order?
  16. Do you need transcription in a language other than English? (English and Spanish are pretty easy to find, but there are services that offer transcription in many other languages as well, sometimes at a premium cost.)
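For a sense of what a timed-caption format involves compared with a plain transcript, here is a minimal sketch that writes caption segments in the SRT (SubRip) layout: a running number, a start and end time, and the caption text, separated by blank lines. The function names and the sample segments are hypothetical; a captioning service would normally hand you the finished file.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm, the form SRT files use."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def to_srt(segments):
    """Turn (start_seconds, end_seconds, text) tuples into SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Joining with a newline leaves one blank line between caption blocks.
    return "\n".join(blocks)

# Invented caption segments, for illustration only.
segments = [
    (0.0, 2.5, "Welcome to the workshop."),
    (2.5, 6.0, "Today we will talk about captions."),
]
srt_text = to_srt(segments)
print(srt_text)
```

A plain transcript drops the timing lines entirely, which is why it cannot be converted into captions without redoing the timing.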

 

As for recommended services, I’m glad to recommend both AutomaticSync and 3Play; we’ve used both and have been very happy with them.