Data Wrangling

File naming conventions

A file naming convention is a framework for naming your files so that each name describes what the file contains and how it relates to other files. Establish a convention that aligns with your discipline's standards before you begin collecting data files.

File names should:

  • be machine-readable and human-readable, and sort well under the default ordering
  • be named consistently
  • be short but descriptive (<25 characters)
  • use underscores in place of spaces, dots or slashes
  • use the ISO 8601 date format: YYYYMMDD
  • include a version number if applicable.
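The guidelines above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name `build_filename`, the date-description-version ordering, the two-digit version suffix, and the sample date are hypothetical choices, and the 25-character limit is assumed to apply to the name without its extension. Adapt the pattern to your discipline's standard.

```python
import re
from datetime import date

MAX_LENGTH = 25  # guideline above: fewer than 25 characters (assumed to exclude the extension)


def build_filename(description: str, day: date, version: int, extension: str) -> str:
    """Build a name following a hypothetical pattern: YYYYMMDD_description_vNN.ext"""
    # Replace spaces, dots, slashes and other non-alphanumeric runs with one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", description).strip("_").lower()
    # ISO 8601 basic date format (YYYYMMDD) sorts chronologically under default ordering.
    stamp = day.strftime("%Y%m%d")
    # Zero-padding keeps v02 ahead of v10 when files are sorted by name.
    stem = f"{stamp}_{slug}_v{version:02d}"
    if len(stem) >= MAX_LENGTH:
        raise ValueError(f"'{stem}' has {len(stem)} characters; shorten the description")
    return f"{stem}.{extension.lstrip('.')}"


# The date below is a made-up example value.
print(build_filename("Test data", date(2016, 4, 12), 2, "xlsx"))
# prints: 20160412_test_data_v02.xlsx
```

Generating names through one function, rather than typing them by hand, is one way to keep them consistent across a project.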

Poor file naming examples

  • Test data 2016.xlsx
  • Meeting notes Jan 17.doc
  • Notes Eric.txt
  • Final FINAL last version.docx

Folder structure

A logical folder structure makes your files easy to find and access. Before you start, check for any established procedures within your team or department. One best practice is to structure folders hierarchically and name them consistently. Whether you store your data in cloud storage, on local drives, or in network locations, a well-planned, logical, and consistent folder and file structure will help to make the data findable and reusable in the future.
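As a minimal sketch of a hierarchical, consistently named structure, the Python snippet below creates a project folder with a few subfolders. The folder names (`project_name`, `data/raw`, `outputs/figures`, and so on) and the helper `create_structure` are hypothetical; substitute whatever layout your team or department has already established.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: top-level folders mapped to their subfolders.
LAYOUT = {
    "data": ["raw", "processed"],
    "docs": [],
    "outputs": ["figures", "tables"],
}


def create_structure(root: Path, layout: dict[str, list[str]]) -> list[Path]:
    """Create every folder in the layout under root and return them sorted."""
    folders = []
    for parent, children in layout.items():
        folders.append(root / parent)
        folders.extend(root / parent / child for child in children)
    for folder in folders:
        # exist_ok=True makes the function safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
    return sorted(folders)


# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    for folder in create_structure(Path(tmp) / "project_name", LAYOUT):
        print(folder.relative_to(tmp))
```

Keeping the layout in one place, as in `LAYOUT` here, makes it simple to recreate the same structure for each new project.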

Excel for data analysis

These resources are available to learn data analysis with Excel.

  • Microsoft Analyse data in Excel - This resource introduces the Analyze Data functionality that is available to Microsoft Office 365 users.
  • Excel for researchers training - This one-day course is conducted regularly by WSU partner Intersect. The course will teach you to use Microsoft Excel to import, sort, filter, copy, protect, transform, summarise, merge, and visualise research data. Check the Research Events calendar for upcoming sessions.
  • LinkedIn Learning Excel data analysis - This self-guided learning course provides instruction on using the data analysis and visualisation tools built into Excel.

Training sessions and workshops

Intersect Online Training: Introduction to Research Data Management at Western Sydney - This two-hour workshop is ideal for researchers who want to know how to create a research data management plan. Check the Research Events calendar for upcoming sessions.

Western Sydney University conducts regular research training sessions, including data manipulation and visualisation using R and Python, Excel for data analysis, surveying with Qualtrics and REDCap, text analysis with NVivo, and preparing your research data management plan. Register for an upcoming event via the Research calendar.

Self-guided learning

School of Data is a series of modules on getting the most out of your data. It includes self-guided learning courses on collecting, extracting, cleaning, and mapping data.

LinkedIn Learning has a range of Python and R tutorials for data handling. Use your WSU credentials to log in.

Data Carpentry offers a hands-on course on data organisation and practices for more effective data wrangling.

Research Data MANTRA online course

MANTRA is a free online course designed for researchers or others who manage digital data as part of a research project.

Through a series of interactive online units you will learn about terminology, key concepts, and best practice in research data management.

If you need further assistance, both staff researchers and HDR students can request research data management advice through the forms available in WesternNow.