Text mining for technology foresight (hereafter “text mining”) is a form of “content analysis” with some variations. It exploits text and numerical data sources of various sorts, especially those related to “technology intelligence”. Such intelligence is the prime requirement for effective technology management, because organizations operating in competitive and/or collaborative environments must track external technology developments. It underpins Future-oriented Technology Analysis (FTA), which encompasses key aspects of technology forecasting, assessment, foresight, and roadmap development.
Text mining therefore focuses on extracting useful intelligence from electronic text sources, both publicly available and organization-confidential. The method serves FTA interests in three major ways:
- Identifying R&D emphases that facilitate the process of future developments;
- Providing time series for trend extrapolation and growth modeling;
- Determining “innovation indicators” that pertain to the prospects for successful technology applications.
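The second use above, time series for trend extrapolation, can be illustrated with a minimal sketch. It assumes yearly record counts for a topic have already been tabulated from search results. The counts, the function names `fit_exponential_trend` and `extrapolate`, and the choice of a simple exponential model are illustrative assumptions rather than part of the method as described; growth modeling in practice may call for other curves (for example, logistic).

```python
import math

def fit_exponential_trend(years, counts):
    """Least-squares fit of log(count) = a + b * year; returns (a, b).

    Assumes every count is positive, since the fit is done on logarithms.
    """
    logs = [math.log(c) for c in counts]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    b = sxy / sxx              # estimated yearly growth rate (log scale)
    a = mean_y - b * mean_x    # intercept
    return a, b

def extrapolate(a, b, year):
    """Project the fitted exponential trend to a given year."""
    return math.exp(a + b * year)

# Hypothetical yearly publication counts for one technology topic.
years = [2015, 2016, 2017, 2018, 2019]
counts = [10, 20, 40, 80, 160]

a, b = fit_exponential_trend(years, counts)
forecast_2020 = extrapolate(a, b, 2020)
print(f"growth rate per year (log scale): {b:.3f}")
print(f"extrapolated count for 2020: {forecast_2020:.0f}")
```

With these invented counts, which double every year, the fitted growth rate equals ln 2 and the 2020 extrapolation comes out at about 320 records. Real series are noisier, so such projections are best treated as a starting point for expert review.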
Text mining is also considered to be closely related to better-known methods, such as “technology monitoring”, “environmental scanning”, and “literature review”. Taken together with these concepts, text mining should be able to scan and digest a broad range of related literature sources, as well as raw data and information, in order to identify developmental patterns, key events, and significant changes in various environments.
Approach & Methodology
In practice, text mining is usually an eight-step process:
- Step 1: Define key questions to be addressed: The questions must relate to management-of-technology issues. They guide choices about which data sources to use and which analyses to pursue, which focuses data gathering and makes answering the questions more effective.
- Step 2: Obtain suitable data: Hundreds of available databases can be utilized to extract data to fulfill technical intelligence and research needs.
- Step 3: Make the searches: The inquiries must be properly bounded from the beginning so that a “quick and raw” initial search can be made effectively. The abstracts retrieved by the search then go into a quick analysis in the next step.
- Step 4: Import into text mining software: Initial search results can be imported into TechOASIS, a text-mining software package, for further analysis of field-structured text records.
- Step 5: Clean the data: Data cleaning, also called “clumping”, is carried out to consolidate name variations from multiple databases. Software tools can help with this aggregation.
- Step 6: Analyze and interpret: Analyses along many paths and in many forms should help answer the essential questions driving the work. Once the text mining software has performed the basic operations, analysis and interpretation follow to allow easier exploration. Note that some operations can also be done without text mining software, e.g. with the search engines of website databases.
- Step 7: Communicate and present information: The prerequisite of this step is to know how the key users like to receive technical intelligence. Reporting styles may be combined in various ways, including oral and written, electronic and paper, text, tables, and figures (possibly with video or animations), and interactive exchange (e.g. a workshop). Explanations should be directed at the audience as efficiently as possible.
- Step 8: Standardize and semi-automate: A Rapid Technology Intelligence Process (RTIP) can be adopted to standardize and automate text mining analyses. With the help of RTIP, technology managers and professionals become more familiar with the text mining outputs. Analyses can also be done faster and more cheaply, as fewer resources are used to generate more value.
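The data cleaning of Step 5 can be sketched in a few lines. This is a minimal illustration, assuming organization names arrive as plain strings from several databases. The suffix list, the sample names, and the functions `normalize_name` and `clump` are hypothetical and do not reflect how any particular text mining package performs the consolidation.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes treated as noise when matching names.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc", "company"}

def normalize_name(name):
    """Reduce a raw organization name to a comparison key."""
    lowered = name.lower().replace(".", "")            # "I.B.M." -> "ibm"
    tokens = re.sub(r"[^a-z0-9]+", " ", lowered).split()
    kept = [t for t in tokens if t not in LEGAL_SUFFIXES]
    # Fall back to all tokens if the name consisted only of suffixes.
    return " ".join(kept or tokens)

def clump(names):
    """Group raw name variants under their shared normalized key."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return dict(groups)

# Invented name variants, as they might appear across different databases.
raw_names = ["IBM Corp.", "I.B.M. Corporation", "IBM", "Siemens AG", "SIEMENS ag"]

groups = clump(raw_names)
for key, variants in groups.items():
    print(key, "<-", variants)
```

Rule-based normalization of this kind only catches simple variations. Misspellings, mergers, and renamed organizations generally need fuzzy matching or a manually reviewed thesaurus, which is one place where expert involvement pays off.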
Text mining has both strengths and weaknesses. The method can answer four types of questions: who, what, where, and when. The other two types, how and why, almost always require blending the method with expert opinion to infer processes (how?) and reasons (why?). Like all methods, text mining does not stand well alone; it needs to be combined with others, particularly expert opinion. High involvement of subject-matter experts in the process is essential. Basic tabulations must come with thoughtful interpretation, and relevant “innovation indicators” require complementary expert opinion to test observations and relationships generated from the information resources. In sum, combining text mining results with expert opinion brings the strengths of each into play.
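The four question types that text mining can answer map naturally onto basic tabulations of field-structured records. The sketch below assumes each record carries an organization, keywords, a country, and a year; the field names, the sample records, and the `tabulate` function are invented for illustration.

```python
from collections import Counter

# Hypothetical field-structured records, e.g. parsed from abstract databases.
records = [
    {"author_org": "Univ A", "keywords": ["fuel cells", "catalysts"],
     "country": "US", "year": 2019},
    {"author_org": "Firm B", "keywords": ["fuel cells"],
     "country": "DE", "year": 2020},
    {"author_org": "Univ A", "keywords": ["fuel cells", "membranes"],
     "country": "US", "year": 2020},
]

def tabulate(records):
    """Count records by organization, topic, country, and year."""
    return {
        "who": Counter(r["author_org"] for r in records),
        "what": Counter(k for r in records for k in r["keywords"]),
        "where": Counter(r["country"] for r in records),
        "when": Counter(r["year"] for r in records),
    }

tables = tabulate(records)
for question, table in tables.items():
    print(question, table.most_common(3))
```

Counts like these say nothing about how or why a pattern arose, which is why the tabulations are handed to experts for interpretation.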
Text mining is still new to many technology managers, professionals, and futurists. Hence, it is crucial to lay certain groundwork for its effective application. The following actions could be considered:
- Development of user-friendly databases: putting information products into a form that users find comfortable and easy to grasp is pivotal. A number of R&D databases useful for technology foresight should be set up. Numerical tabulations should be blended with graphical depictions and text interpretations in a way tailored to the audience’s preferences. An interesting option is to accompany a report with a CD containing the raw abstracts and the mining software. A highly recommended way of tailoring information products to key users’ needs is to put findings into “packages” to avoid information overload.
- Training workshops: training is mandatory for text mining to be conducted effectively. Motivated analysts need support through ongoing access to expert advice. Furthermore, with some investment in application software, e.g. VantagePoint, in-house capabilities can be developed by giving internal staff access to suitable electronic information resources and then implementing tools to search, retrieve, and analyze selected records.
- Building relationships: establishing strong relationships between analysts and users is important for building familiarity with text mining outputs and using them effectively. Three factors enhance the prospects: 1) facilitating direct links between the analysts and users; 2) engaging multiple analysts and multiple technology decision makers to develop a robust, learning network; and 3) directing attention to successes so that the organization appreciates the value gained.
In conclusion, text mining has a role to play in comparing and assessing the research outputs of various organizational groups. National foresight studies can exploit text mining to compare national R&D outputs with those of benchmark countries, or with their national economic development targets. The most promising prospects for the method include advances in interpretable innovation indicators that benchmark technological progress, and the advent of “macros” (Visual Basic scripts) that control sequences of steps in the relevant text mining software.