INFS7410 Practicals
Week 1 Instructions
1. Download document collection

Download the document collection into this week's prac folder using one of the following links:

option1: https://msmarco.blob.core.windows.net/msmarcoranking/collection.tar.gz
option2: https://infs7410.uqcloud.net

Choose whichever gives you the fastest download speed on your connection (if you are at UQ, option2 will probably suit you best). Note that the collection takes 4-5 GB of space after unzipping, and you will also need to create an index: we recommend having at least 10 GB free on your drive. After you have downloaded the data, unzip it into the same folder. While the collection downloads, follow the instructions below to set up the Python conda environment for this course.

2. Environment setup
Note: if you are using a VPN, please disconnect it during the setup.
Setting up the environment is crucial, as we rely on it for all subsequent practicals and assignments, so ask for help if you get stuck on any errors during the setup. Work through the following steps one by one:
1. We use Anaconda to manage our environment. Follow the instructions on the official website to download and install Anaconda. Note that the installation steps vary by operating system (e.g., Windows vs macOS), so make sure you follow the correct ones. After you finish the installation, open the terminal and type conda list to verify the installation.
2. Type conda create -n infs7410 python=3.7 to create an environment for this course; we use Python 3.7.
3. Type source activate infs7410 on Linux/macOS, or conda activate infs7410 on MS Windows, to activate your environment. You should then see (infs7410) at the front of your command line.
4. Install JDK 11 via conda: conda install -c conda-forge openjdk=11 . On Windows ONLY, update the PATH environment variable so that the Java JDK is linked:
conda env config vars set PATH="%PATH%;%CONDA_PREFIX%\Library\bin\server"
then reactivate the environment:
conda activate infs7410
5. Install the Python toolkit we use in this course (and in many of our research projects), called Pyserini (the name indicates that it is a Python port of the original Java codebase, Anserini); do this by typing pip install pyserini==0.13.0 . Note that this is not a commercial search engine application, but it is very popular in the information retrieval research community. It is built on the Lucene codebase, which is at the core of many commercial search engine products, including Elasticsearch. More information about this toolkit can be found here.
6. Now, let's install Jupyter Notebook: we will use notebooks for most of the practicals and assignments. First, use the cd command on the command line to move into the tutorial folder. Then type pip install notebook . Finally, type jupyter notebook ; this will open the notebook in your browser. Now let's move to the notebook: click prac-week1.ipynb, and see you there.
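If something does not behave as expected, a quick sanity check of the steps above can help narrow down where the setup went wrong. The script below is a minimal sketch and not part of the official course material: the function names installed_version and check_environment are made up for illustration, and the checks simply mirror the versions named in the steps (Python 3.7, Pyserini 0.13.0). It assumes that a java executable on the PATH is a reasonable proxy for the JDK being installed, which may not catch every Windows linking problem.

```python
import shutil
import sys


def installed_version(package):
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Python 3.7 (the version used in this course) lacks importlib.metadata,
        # so fall back to pkg_resources, which ships with setuptools.
        import pkg_resources

        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_environment(expected_python=(3, 7), expected_pyserini="0.13.0"):
    """Compare the active environment against the versions named in the steps above."""
    return {
        # Step 2: the environment was created with python=3.7.
        "python_ok": sys.version_info[:2] == tuple(expected_python),
        # Step 4: assumption, a java executable on PATH means the JDK is visible.
        "java_on_path": shutil.which("java") is not None,
        # Step 5: Pyserini pinned to 0.13.0.
        "pyserini_ok": installed_version("pyserini") == expected_pyserini,
        # Step 6: the notebook package installed via pip.
        "notebook_installed": installed_version("notebook") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'CHECK THIS STEP'}")
```

Run it from a terminal in which the infs7410 environment is active; any line reporting CHECK THIS STEP points back to the corresponding numbered step.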
2022-08-29