Preparing data
Authors of this page: Celia Taylor and Graham R. Gibbs
Affiliation: University of Huddersfield
Date written: 30th June 2005
Before you can begin the analysis of the data you have collected, it has to be in a format suitable for analysis. The most common form of collecting data in qualitative research is interviews which are then transcribed into text data.
If you think you may be using a CAQDAS package, please refer to the General Transcription Guidelines, as the data has to be imported in a certain format.
Video data
A very common issue when using video data is how to convert your video (which could be in any one of a variety of analogue or digital formats) into a digital video file that software like Transana or Atlas.ti recognizes, ideally MPEG-1 or MPEG-2. This page contains a few links to issues you need to think about.
Transcription
After an informant is interviewed the researcher needs to decide the most suitable format for analysis. Usually it is typed up rather like a script for a play (see Figure 1). Further details about the format of a transcript are below.
Please note that line numbering is only required by some theoretical approaches and should NOT be used on transcripts that will be imported into a CAQDAS package. See further details at Line numbering in CAQDAS.

Figure 1. Example of part of a transcribed interview.
Two short videos from a lecture on transcribing
Why transcribe
It is not compulsory to use a transcript of an interview or field notes to undertake an analysis. Some forms of analysis can be done directly from the recording. Doing this could help you focus on what is going on and not get too focused on the detail of what people have said. However, for methodologies such as discourse and conversation analysis, the focus is precisely on what people said and how they said it, so a detailed transcript is a necessity. A transcript can also be shared within a team.
Semi-transcription
It may not be necessary to analyse everything the informant said, or to analyse all of it in the same detail. In this case you can transcribe parts of the interview and write notes on the rest. This is quicker and cheaper than a full transcription. It will also keep the researcher focused on the bigger picture. The downside is that the transcribed parts can be out of context and difficult to interpret without constant reference back to the tape or the video. Also, what you think is significant at the time of transcribing may not be what you think is significant later on in the analysis.
Detail and level of transcription
Few people speak in grammatical prose and many have speech habits such as "you know" so the researcher needs to decide what needs to be transcribed from the recording. This will also be determined by the methodological approach. Some require more precise detail than others.
If you have arranged for someone else to transcribe your recordings for you, discuss the level of transcription with them before they start work. Make it clear what level you require. Very often typists are trained to 'tidy up' the speech they hear on the recordings, but that may not be what you want.
Semi transcription
I wouldn't leave the kids when they were young in case something happened. My grandmother offered to look after them, but her health was not good.
Full transcription
I wouldn’t leave the kids with somebody when they were young. I wanted to go out with my friends but I was worried about what might happen while I was out. My Grandmother offered to look after them but I felt it was too much for her as her health was not good.
Colloquial transcription
I would’ne leave the kids wi’ som’d’y when they were young. I wanted to go oot wi’ my friends but I wis waryied aboot what meeght happen while I wis oot. My Granmither offered to look after them but I felt it wis too much for her as her health was nae good.
Time needed for transcription
It is estimated that someone who can touch type will take 4 to 6 times the length of the recording to transcribe it. So a 1 hour tape will take between 4 and 6 hours. A poor recording or a slow typist will increase transcription time significantly.
A researcher transcribing their own interviews needs to allow for this time when planning their project. But transcribing also gives the researcher an opportunity to really get to know the content of the interview, and it can form part of the analysis as new questions or issues arise during the chore of transcription.
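The rule of thumb above can be turned into a quick planning calculation. The following is only a rough sketch in Python: the function name and the example project (ten interviews of 45 minutes each) are hypothetical, and the factors 4 and 6 are simply the estimate quoted above for a touch typist working from a clear recording.

```python
def transcription_hours(recording_minutes, low_factor=4, high_factor=6):
    """Return a (low, high) estimate of typing time in hours.

    The default factors are the 4x to 6x rule of thumb for a touch typist;
    raise them for poor recordings or slower typists.
    """
    if recording_minutes < 0:
        raise ValueError("recording_minutes must be non-negative")
    recording_hours = recording_minutes / 60
    return recording_hours * low_factor, recording_hours * high_factor


# Hypothetical project: ten interviews of 45 minutes each.
low, high = transcription_hours(10 * 45)
print(f"Allow roughly {low:.0f} to {high:.0f} hours of typing time")
```

Checking the finished transcript against the recording is not included in this figure and would need to be budgeted separately.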
Employing transcribers
If funds are available the researcher could employ an audio typist to transcribe the interviews. This is normally priced per hour or per set number of words. It saves the researcher time, but they will still need to check the completed transcription against the tape. Also, the transcriber is unlikely to know which areas are significant and so will need to do a full transcription. A researcher doing their own transcription could decide which parts of the interview need to be transcribed and which parts are off-topic.
Accuracy/Checking transcription
It is essential to check the transcription against the tape to see if the interview has been transcribed correctly. It is easy for words and phrases to be misheard; for instance, 'CAQDAS' can sound like 'Cactus'.
Transcription equipment
There are several different methods of transcription. The traditional approach is an audio cassette and a transcription machine. A more recent approach is digital recording on an MP3 recorder, which produces MP3 audio files, together with the use of transcribing software.
See the links in the sidebar for more on digital recording and recording equipment and transcribing.
Transcription software
Transcriber 1.4.2
Transcriber is a free software package for computer-based transcription of audio recordings. The software allows you to synchronise the transcript with the audio recording by manually segmenting and labelling speech turns and topic changes. This helps with checking the accuracy of a transcript and also allows you to search for a section by topic.

Figure 2. Screen shot of the software Transcriber 1.4.2
Speech recognition software
Voice recognition software means that you can speak to your computer and it will convert your voice into text. However, at the moment the software is only able to recognise one voice, and it takes time to train the software to recognise that voice. This means it is not possible to play an interview recording of a respondent and get it transcribed by the computer. However, a researcher can train the software to recognise their own voice. They can then listen to their recording of the respondent using headphones and speak what they hear into the microphone connected to the computer. This is a method often used by researchers who are slow typists. There are currently two easily available programs: Naturally Speaking and ViaVoice. Both programs benefit from being run on a fast computer with lots of memory.
Dragon Naturally Speaking
This program from ScanSoft is a PC program for voice recognition. You talk to your computer and your words appear in most Windows-based programs, including Microsoft® Word. You can dictate directly into a PC or into any ScanSoft-approved handheld digital recorder. A background noise cancelling microphone is included with the software.
IBM ViaVoice
ViaVoice® is available on the Windows, Macintosh (including Mac OS X) and handheld computer platforms. It has very similar functions to Naturally Speaking. It, too, is now available from and supported by ScanSoft.
Format of a transcript
CAQDAS package
Please refer to General Transcription Guidelines if you will be using a CAQDAS package.
Line numbering in CAQDAS
Do not try to import a transcript into a CAQDAS package with line numbers. If you want to use line numbers you need to create them within the CAQDAS package. None of the CAQDAS packages will create line numbering which will match a Word version (as line lengths will be different). Most CAQDAS packages will create paragraph numbers in reports (files generated in a CAQDAS package). If you must have line numbers for your analysis then get some advice as it may mean a CAQDAS package is not suitable.
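The reason line numbers do not carry over is that they depend on where each line happens to wrap. As an illustrative sketch only (Python's standard textwrap module stands in here for whatever page width a word processor or CAQDAS package uses, and the widths 40 and 70 are arbitrary assumptions), the same paragraph wrapped at two widths produces different line numbers for the same words:

```python
import textwrap

# The paragraph is taken from the full transcription example on this page.
paragraph = (
    "I wouldn't leave the kids with somebody when they were young. "
    "I wanted to go out with my friends but I was worried about "
    "what might happen while I was out."
)


def numbered_lines(text, width):
    """Wrap text to the given width and prefix each line with its number."""
    lines = textwrap.wrap(text, width=width)
    return [f"{n:>3}  {line}" for n, line in enumerate(lines, start=1)]


narrow = numbered_lines(paragraph, width=40)  # e.g. a narrow display
wide = numbered_lines(paragraph, width=70)    # e.g. a wide printed page

print("\n".join(narrow))
print()
print("\n".join(wide))
```

The narrow version has more lines than the wide one, so a reference such as "line 3" points at different text in each. Paragraph numbers do not have this problem because they do not depend on line length.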
Printing out transcripts
If you are using a CAQDAS package but also want to look at a printed version of a transcript, it is best to print it after you have imported it into the CAQDAS package so you are working in a consistent format.
Printing out your transcripts allows you to read them anywhere; there are times and places where computers are cumbersome. It is also 25% quicker to read text printed on paper than on a screen. However, any coding you do on paper will need to be input into the software.
Manual (pen and paper) analysis
1. Put who is speaking in capitals, then a tab, then what they said. See Figure 3. (Capitals help with searches later.)
2. Ensure you spell the names the same way throughout the transcript.
3. Use INT-NAME or I-NAME or INTERVIEWER (without the name if only one person did all the interviews) to indicate the interviewer is speaking.
4. Put two carriage returns between each speaker. This makes it easier to see who is talking.
5. It is useful to have space on your printed transcript for writing notes and coding ideas. Do this by having wider margins and double spacing the text.
6. Line numbers are useful for cross referencing although not all approaches use them. See 'To show line number in MS Word' below.

Figure 3. Elements of a transcribed text.
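If your transcript starts life as a list of speaker turns, the layout rules above (speaker in capitals, a tab, the speech, and a blank line between speakers) can be applied automatically. The following Python function is a minimal sketch, not part of any transcription package; the function name, the respondent name MARY and the interviewer's question are invented for illustration.

```python
def format_transcript(turns):
    """Lay out (speaker, speech) pairs as SPEAKER<TAB>speech.

    Speaker labels are upper-cased so they are spelled consistently,
    and turns are separated by two carriage returns (a blank line).
    """
    blocks = []
    for speaker, speech in turns:
        label = speaker.strip().upper()
        blocks.append(f"{label}\t{speech.strip()}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical turns; only the respondent's sentence comes from this page.
turns = [
    ("Interviewer", "Did you go out much?"),
    ("Mary", "I wouldn't leave the kids with somebody when they were young."),
]
print(format_transcript(turns))
```

Wider margins, double spacing and line numbers are page layout settings and would still be applied in the word processor before printing.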
To show line number in MS Word:
On a Windows PC:
- Click File in the top menu and then click Page Setup.
- The 'Page Setup' dialog window appears. Click the Layout tab and then click the Line numbers... button near the bottom of this dialog.
- A small 'Line Numbers' dialog window appears (see Figure 4). Click the Add line numbering tick box. Select the number you want the line numbering to start at and the number of lines you want to count by (usually 1 and 1). Finally, select how you want to number the document. Normally this would be Continuous. Click the OK button in the Line Numbers dialog and then click the OK button in the Page Setup dialog. The line numbering should appear on your document.
On a Macintosh:
- Click Format:Documents then click the Layout tab. Click the Line Numbers button, then click the Add line numbering check box and the Continuous radio button.
N.B. the line numbers are only visible in Page Layout View.
Figure 4. Line number dialog box.

