We had two excellent speakers for our seminar on 18th March, entitled “Search and you will find?”: Karen Blakeman and Tony Hirst. The question mark in the title was deliberate, since the underlying message was that search and discovery can sometimes throw up the unexpected.
Learning objectives for the day were:
- To understand the commercial, social and regulatory factors that have influenced (or will influence) Google search engine results.
- To be able to apply new search behaviours that will improve accuracy and relevance of search results.
- To appreciate data mining and data discovery techniques and the risks involved in using them, as well as the education and skills required for their disciplined and ethical use.
Karen Blakeman delivered an informative and thought-provoking talk about our possibly misplaced reliance on Google search results. She discussed how Google is undergoing major changes in the way it analyses our searches and presents results, which are influenced by what we’ve searched for previously and information pulled from our social media circles. She also covered how EU regulations are dictating what the likes of Google can and cannot display in their results.
Amongst the many examples that Karen gave of imperfect search results, one about Henry VIII’s wives stood out: for his wife Jane Seymour, Google had sourced an image of the actress Jane Seymour.
This is an obvious and easily spotted error; others are far subtler and probably go unnoticed by the vast majority of search users. The problem, as Karen explained, is that Google does not always provide attribution for where it is sourcing its results, and where attribution is provided, the user must (or should) decide whether the source is reliable or authoritative. Users should beware when searching for medical or allergy symptoms; the sources can be arbitrary and are not necessarily authoritative medical websites. It would appear that Google’s algorithms decide what is scientific fact and what is aggregated opinion!
The clear message was to use Google as a filter to point us to likely answers to our queries, but to apply more detailed analysis of the search results before assuming the information is correct.
Karen’s slides are available at: http://www.rba.co.uk/as/
Tony Hirst gave us an introduction to the world of data analytics and data visualisation and the challenges of abstracting meaning from large datasets. Techniques such as data mining and knowledge discovery in databases (KDD) use machine learning and powerful statistics to help us discover new insights from ever-larger datasets. Tony gave us an insight into some of the analytical techniques and the risks associated with using them. In particular, if we leave decision making up to machines and the algorithms inside them, are we introducing new forms of bias that human decision makers might avoid? What do we, as practitioners, need to know in order to use these tools in a responsible way?
As Tony explained, the most effective data analysis comes down to discovering relationships and patterns that would otherwise be missed by looking at just one dataset in isolation, or analysing data in ranked lists. Multifaceted data analysis, using – for example – datasets applied to maps, can give unique visualisations and more insightful sense making.
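Combining datasets in this way often depends on matching names that are spelled slightly differently in each source. The sketch below is not taken from Tony's slides; it is a minimal illustration of partial (fuzzy) string matching using Python's standard `difflib` module. The helper names `similarity` and `best_match`, the 0.8 cutoff and the place names are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def best_match(name, candidates, cutoff=0.8):
    """Return the candidate most similar to `name`, or None if none is close.

    `cutoff` is an arbitrary threshold chosen for illustration; a real
    project would tune it against known matches and mismatches.
    """
    if not candidates:
        return None
    score, candidate = max((similarity(name, c), c) for c in candidates)
    return candidate if score >= cutoff else None


# Hypothetical place names as they might appear in a second dataset.
places = ["Milton Keynes", "Manchester", "Leeds"]

for raw in ["milton keynes ", "Milton Keyns", "Zzz"]:
    print(repr(raw), "->", best_match(raw, places))
```

A match found this way is only a suggestion. Borderline scores still need a human check, which echoes the talk's point about using these tools responsibly.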
Amongst many other techniques, Tony discussed Concordance Correlation, Lexical Dispersion, Partial (Fuzzy) String Matching and Anscombe’s Quartet.
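Anscombe's Quartet is a good illustration of why visualisation matters: four small datasets with almost identical summary statistics that look completely different when plotted. The sketch below is not from Tony's slides; it uses the widely reproduced values from Anscombe's 1973 paper and only the Python standard library. The function names `pearson` and `summarise` are our own.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet: datasets I-III share the same x values.
X_COMMON = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_COMMON, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                     7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_COMMON, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                      6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_COMMON, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                       6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarise(xs, ys):
    """Return (mean of x, mean of y, correlation) for one dataset."""
    return mean(xs), mean(ys), pearson(xs, ys)


for label, (xs, ys) in QUARTET.items():
    mx, my, r = summarise(xs, ys)
    print(f"{label:>3}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The printed means and correlations agree to about two decimal places across all four datasets, yet a scatter plot of each shows a roughly linear cloud, a curve, a line with one outlier, and a vertical stack with one extreme point. Summary tables alone would hide those differences.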
Tony’s slides will be available at: http://www.slideshare.net/psychemedia
Following the keynote presentations from Karen and Tony, the following questions were put to the delegates:
- How can organisations ensure their staff is using (external) search engines effectively?
- How do you determine the value of search in terms of accuracy, time, and cost?
- If I wanted to know how to use data visualisation and data analysis tools, where do I go? Who do I ask?
The delegates moved into three groups to discuss and respond to these questions (one group per question). The plenary feedback was as follows:
Group 1 – How can organisations ensure their staff is using (external) search engines effectively?
- Ban them from using Google
- More training
- Employ specialists to do research
- Use subscription services
- Change the education system
Group 2 – How do you determine the value of search in terms of accuracy, time, and cost?
- Cost and Time are variable
- Accuracy is the most important criterion
- Differentiate between “value” and “cost”
Group 3 – If I wanted to know how to use data visualisation and data analysis tools, where do I go? Who do I ask?
Lastly, we’d like to thank our speakers and the delegates for making this such an interesting, educational and engaging seminar.
Karen Blakeman (@karenblakeman) is an independent consultant providing a wide range of organisations with training, help and advice on how to search more effectively, how to use social and collaborative tools for research, and how to assess and manage information. Prior to setting up her own company, Karen worked in the pharmaceutical and healthcare industry, and for the international management consultancy group Strategic Planning Associates. Her website is at http://www.rba.co.uk/ and her blog is at http://www.rba.co.uk/wordpress/.
Tony Hirst (@psychemedia) is a lecturer in the Department of Computing and Communications at the Open University, where he has authored course material on Artificial Intelligence and Robotics, Information Skills, and Data Analysis and Visualisation. He is also a Data Storyteller with the Open Knowledge School of Data. An open data advocate and Formula One data junkie, he blogs regularly on matters relating to social network analysis, data visualisation, open education and open data policy at blog.ouseful.info.