
CS 230 Java Programming: Search Engine Over a Collection


CS 230 Java Programming

Task

IR Phase 1 Project Requirements

The overall project for this semester is to simulate a search engine over a collection (“corpus”) of documents. This project will be divided into three phases. The requirements described here are for Phase 1.

Phase 1 is broken down into tasks, each of which will also be used in subsequent phases. Each task must be uploaded to the Blackboard system by the syllabus deadline, in the corresponding assignment content folder for Phase 1:

IR.P1.Task#1) You will need to maintain your own corpus of documents for the semester. To do so, come up with 10 neutral (i.e., non-controversial) queries (for example: Who was the 16th U.S. President?) that you will submit to a search engine of your choice. Upload these ten queries to the Blackboard system for IR.P1.Task#1.

IR.P1.Task#2) You are then to download the first 20 (non-controversial) webpage (html) responses that the search engine returns for each of the 10 queries. This is done manually: you have to download the pages one by one, but you only have to do this once. There will be a total of 200 html files. (We will be discussing shortly in class how to process these using the Java Regex package. You may NOT use 3rd party code. You MUST write your own. You do not necessarily need regex, but it does provide much more concise code.) Place all 200 html files in a directory named Corpus and compress/zip the entire directory. Upload this to the Blackboard system for IR.P1.Task#2. YOU SHOULD ASSUME (IN GENERAL) THAT NO CREDIT WILL BE GIVEN FOR SHARED FILES OR LINKS TO FILES. HOWEVER, FOR THIS TASK, if the Blackboard system limits do not allow you to upload this compressed file, then you can store it in cloud storage and upload a secure link to that file to the Blackboard system.
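As a rough illustration of the regex-based processing mentioned above, the sketch below strips markup from an html string and splits the remaining text into lowercase word tokens using only java.util.regex. The class name HtmlTokenizer, the method tokenize, and the specific patterns are assumptions made for this example; they are not part of the project requirements, and real web pages may need more careful handling (comments, malformed tags, character entities).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: turns raw HTML into a list of lowercase word tokens. */
final class HtmlTokenizer {
    // Assumption: script and style bodies are not visible text, so drop them first.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");
    // Any remaining tag, e.g. <p> or </div>.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");
    // Character entities such as &nbsp; or &#160; are replaced by a space.
    private static final Pattern ENTITY = Pattern.compile("&[#\\w]+;");
    // Assumption: a "word" is a run of ASCII letters or digits.
    private static final Pattern WORD = Pattern.compile("[A-Za-z0-9]+");

    static List<String> tokenize(String html) {
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        text = ENTITY.matcher(text).replaceAll(" ");

        List<String> tokens = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{color:red}</style></head>"
                + "<body><p>Abraham Lincoln was the 16th President.</p></body></html>";
        System.out.println(tokenize(html));
    }
}
```

Keeping tokenization in one place means the stoplist and the inverted index can both work on the same list of tokens, so word positions stay consistent between them.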

IR.P1.Task#3) Identify a Stoplist (either download one or compute one yourself in a separate program) and store it in a hash structure. Program code is needed to store the stopword list in a hash structure and to output the hash structure to a text file. Upload the .java files necessary to accomplish this to the Blackboard system for IR.P1.Task#3.
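One possible shape for this task is sketched below, assuming the stoplist arrives as a plain-text file with one stopword per line. The class name StopList, its methods, and the command-line usage shown in main are hypothetical choices for this example; the task itself only requires a hash structure and the ability to write it to a text file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stoplist wrapper backed by a HashSet. */
final class StopList {
    private final Set<String> words = new HashSet<>();

    /** Builds a stoplist from lines of text, one stopword per line (assumed format). */
    static StopList fromLines(Iterable<String> lines) {
        StopList list = new StopList();
        for (String line : lines) {
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                list.words.add(word);
            }
        }
        return list;
    }

    static StopList load(Path file) throws IOException {
        return fromLines(Files.readAllLines(file, StandardCharsets.UTF_8));
    }

    boolean contains(String word) {
        return words.contains(word.toLowerCase(Locale.ROOT));
    }

    int size() {
        return words.size();
    }

    /** Writes the stopwords to a text file, sorted so the output is reproducible. */
    void writeTo(Path out) throws IOException {
        Set<String> sorted = new TreeSet<>(words);
        Files.write(out, sorted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java StopList <stoplist.txt> <output.txt>");
            return;
        }
        StopList list = load(Paths.get(args[0]));
        list.writeTo(Paths.get(args[1]));
        System.out.println("Stored " + list.size() + " stopwords");
    }
}
```

A HashSet gives constant-time membership checks on average, which matters because every token of every document is tested against the stoplist while the index is built.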

IR.P1.Task#4) In Java code, compute an Inverted Index that collectively stores information for the files that are part of the corpus. See the following links for an explanation of what an Inverted Index is (and what it is not, such as a forward index):

You are to use either Java hashmaps or hashtables for storing the inverted index of your corpus. (A separate email will provide tutorial links for hashtables.) What information should you store in the inverted index for each significant (i.e., non-stopword) word found in one of your documents? a) the word; b) the name of the document it was found in; c) a vector specifying, for each occurrence of the word in a document, how many words from the beginning of the document it was found (for this count, include the stopwords as well). You need to do this for every word in every document that is not a stopword. Upload this Java code to the Blackboard system for IR.P1.Task#4.
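The structure described above can be sketched as a nested HashMap: word, then document name, then the list of positions. The class and method names below are hypothetical, and the example assumes zero-based positions (the first word of a document is at position 0); the specification does not say whether counting starts at 0 or 1, so adjust as instructed in class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical positional inverted index: word -> document -> positions. */
final class InvertedIndex {
    private final Map<String, Map<String, List<Integer>>> postings = new HashMap<>();

    /**
     * Adds one document. Positions count every token, including stopwords,
     * but stopwords themselves are not stored in the index.
     */
    void addDocument(String docName, List<String> tokens, Set<String> stopwords) {
        for (int position = 0; position < tokens.size(); position++) {
            String word = tokens.get(position);
            if (stopwords.contains(word)) {
                continue; // not indexed, but the position counter still advanced
            }
            postings.computeIfAbsent(word, w -> new HashMap<>())
                    .computeIfAbsent(docName, d -> new ArrayList<>())
                    .add(position);
        }
    }

    /** Returns document -> positions for a word, or an empty map if the word is absent. */
    Map<String, List<Integer>> postingsFor(String word) {
        return postings.getOrDefault(word, Map.of());
    }

    public static void main(String[] args) {
        InvertedIndex index = new InvertedIndex();
        index.addDocument("d1.html",
                List.of("the", "president", "was", "the", "16th", "president"),
                Set.of("the", "was"));
        System.out.println(index.postingsFor("president")); // {d1.html=[1, 5]}
    }
}
```

Because the position list has one entry per occurrence, the frequency of a word in a document is simply the size of that list, so no separate counter is needed.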

IR.P1.Task#5) The code for each phase has to be compiled using javac (the JDK compiler) and executed using the java command (the JDK runtime environment). Important names of files etc. will be provided on the command line of the java command using “flags.” Details about the usage of flags for this phase will be emailed to you separately and are discussed below. Further code issues will be explained in a separate email on Command Line Parsing. Please note that ALL phases of this project will be run from the command line only. Upload the Java code that processes the project command line and its flags to the Blackboard system for IR.P1.Task#5.
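Since the exact flag conventions are to be provided separately, the following is only a sketch of one way to parse flags of the form -NAME=VALUE that appear in this document (for example -SEARCH=WORD and -output=OutputFileName). The class name FlagParser, the choice to treat flag names case-insensitively, and the decision to reject arguments that do not start with a dash are all assumptions for this example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical parser for command-line flags of the form -NAME=VALUE. */
final class FlagParser {

    /**
     * Returns flag name -> value. Names are upper-cased so that -output and
     * -OUTPUT are treated alike (an assumption, not a stated requirement).
     * A flag without '=' is stored with an empty value.
     */
    static Map<String, String> parse(String[] args) {
        Map<String, String> flags = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-")) {
                throw new IllegalArgumentException("Not a flag: " + arg);
            }
            int eq = arg.indexOf('=');
            String name = (eq < 0 ? arg.substring(1) : arg.substring(1, eq))
                    .toUpperCase(Locale.ROOT);
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Missing flag name: " + arg);
            }
            String value = eq < 0 ? "" : arg.substring(eq + 1);
            flags.put(name, value);
        }
        return flags;
    }

    public static void main(String[] args) {
        // Example run: java FlagParser -SEARCH=WORD -output=out.txt
        parse(args).forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```

Parsing all flags into a map up front lets the rest of the program ask for a flag by name, regardless of the order in which the flags appeared on the command line.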

IR.P1.Task#6) You will need to demonstrate the ability to “query” your inverted index for information such as: a) does a specific word appear in any document? b) in how many documents (and which ones) does a given word appear? c) how many times (frequency) does a word appear in a given document? The project will need to create and utilize a -SEARCH flag in conjunction with an output flag indicating which file the output should go to: -output=OutputFileName

-SEARCH=WORD: searches the Inverted Index for the given WORD and returns the documents in which the word appears, along with how many times it appears in each document.

-SEARCH=DOC: searches the Inverted Index for the given document and returns all words found in that document, along with how many times each appears in it.

NOTE: For these commands, you may also need to pass other parameters via the command line using appropriately named flags.

Upload the Java code files that implement these functions to Blackboard system for IR.P1.Task#6.
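The queries listed for this task can be sketched as small lookups over the nested map structure (word, then document, then positions). The class and method names below are hypothetical, and the example assumes the index is held as a Map<String, Map<String, List<Integer>>>, which is one possible layout rather than a required one.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical query helpers over a word -> document -> positions index. */
final class IndexQueries {

    /** a) Does the word appear in any document? */
    static boolean appearsAnywhere(Map<String, Map<String, List<Integer>>> index, String word) {
        return index.containsKey(word);
    }

    /** b) Which documents contain the word? The size of the result is the document count. */
    static Set<String> documentsContaining(Map<String, Map<String, List<Integer>>> index,
                                           String word) {
        return new TreeSet<>(index.getOrDefault(word, Map.of()).keySet());
    }

    /** c) How many times does the word appear in the given document? */
    static int frequency(Map<String, Map<String, List<Integer>>> index,
                         String word, String doc) {
        return index.getOrDefault(word, Map.of()).getOrDefault(doc, List.of()).size();
    }

    /** Document search: every indexed word in the document with its frequency. */
    static Map<String, Integer> wordsInDocument(Map<String, Map<String, List<Integer>>> index,
                                                String doc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Map<String, List<Integer>>> entry : index.entrySet()) {
            List<Integer> positions = entry.getValue().get(doc);
            if (positions != null) {
                counts.put(entry.getKey(), positions.size());
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<Integer>>> index = Map.of(
                "president", Map.of("d1.html", List.of(1, 5), "d2.html", List.of(3)),
                "lincoln", Map.of("d1.html", List.of(0)));
        System.out.println(documentsContaining(index, "president")); // [d1.html, d2.html]
        System.out.println(wordsInDocument(index, "d1.html"));       // {lincoln=1, president=2}
    }
}
```

Note that the word search is a direct hash lookup, while the document search has to scan every word in the index; if document searches turn out to be slow on the full corpus, a second map keyed by document name is one possible remedy.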

IR.P1.Task#7) This task is predicated on Task#6 being completed. Demonstrate one example of a word search and one example of a doc search. You will upload three files to the Blackboard system for IR.P1.Task#7 (either individually or in one compressed/zipped file, but ALL three are required). The first file is Searches.txt, describing these searches and the actual commands the user needs to enter on the command line to run your project and perform these searches. In addition, upload the two output files corresponding to the two searches. (Use a different name for each output file.)

IR.P1.Task#8) The system should be able to print out the inverted index or other relevant information pertaining to a given document. The project will need to create and utilize a -PRINT flag in conjunction with an output flag indicating which file the output should go to: -output=OutputFileName

-PRINT_INDEX=WORD: prints all the information contained in the Inverted Index for the given WORD into the output file. The exact format is left up to you, but it must contain all of the information.

-PRINT_INDEX=DOC: prints all the information contained in the Inverted Index for the given DOC into the output file. The exact format is left up to you, but it must contain all of the information.

You will need to upload the .java files that implement this feature, together with the demonstrated output files, to IR.P1.Task#8 on the Blackboard system. Also include a Print.txt file that describes which WORD and which DOC you chose to demonstrate this feature and the actual commands the user needs to enter on the command line to run your project and produce these printed outputs.

Place all of these files into a temporary directory on your computer system, compress/zip the directory as a single file, and upload it to the Blackboard system for Task#8. YOU SHOULD ASSUME (IN GENERAL) THAT NO CREDIT WILL BE GIVEN FOR SHARED FILES OR LINKS TO FILES. HOWEVER, FOR THIS TASK, if the Blackboard system limits do not allow you to upload this compressed file, then you can store it in cloud storage and upload a secure link to that file.

NOTE: For these commands, you may also need to pass other parameters via the command line using appropriately named flags.
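Because the output format for printing is left open, the sketch below shows just one possible layout: one tab-separated line per (word, document) pair, followed by the list of positions. The class name IndexPrinter, its methods, and the line format are assumptions made for this example, as is the representation of the index as a Map<String, Map<String, List<Integer>>>.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical printer: formats index entries as "word<TAB>document<TAB>[positions]". */
final class IndexPrinter {

    /** All index information for one word, one line per document, sorted by document name. */
    static List<String> linesForWord(Map<String, Map<String, List<Integer>>> index, String word) {
        List<String> lines = new ArrayList<>();
        Map<String, List<Integer>> perDoc = index.getOrDefault(word, Map.of());
        for (Map.Entry<String, List<Integer>> entry : new TreeMap<>(perDoc).entrySet()) {
            lines.add(word + "\t" + entry.getKey() + "\t" + entry.getValue());
        }
        return lines;
    }

    /** All index information for one document, one line per word, sorted by word. */
    static List<String> linesForDoc(Map<String, Map<String, List<Integer>>> index, String doc) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<String, List<Integer>>> entry
                : new TreeMap<>(index).entrySet()) {
            List<Integer> positions = entry.getValue().get(doc);
            if (positions != null) {
                lines.add(entry.getKey() + "\t" + doc + "\t" + positions);
            }
        }
        return lines;
    }

    /** Writes the formatted lines to the file named by the output flag. */
    static void write(Path out, List<String> lines) throws IOException {
        Files.write(out, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<Integer>>> index = Map.of(
                "president", Map.of("d1.html", List.of(1, 5), "d2.html", List.of(3)),
                "lincoln", Map.of("d1.html", List.of(0)));
        linesForDoc(index, "d1.html").forEach(System.out::println);
    }
}
```

Separating formatting from file writing keeps the formatting logic easy to check by eye, and the same lines can be sent to the console while debugging.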

IR.P1.Task#9) Collect all the .java files necessary for your system into a single temporary directory on your computer system. Compress/zip the directory as a single file and upload it to the Blackboard system for IR.P1.Task#9. If you did this correctly and included only the .java files, without any extraneous project or data files, then the compressed file should not be particularly large and you will be able to upload it as is (and NOT as a shared or cloud file or via a file link).

IR.P1.Task#10) Comment your code appropriately and use meaningful names for classes, methods, variables and constants. You are expected to report on the number of lines of code using the cloc program. This command must be run on the command line (cmd.exe). Once you have collected all .java files into the same temporary directory (see IR.P1.Task#9 above), the following command will report on the contents of each .java file. The report generated by cloc lists the number of actual Java code lines, blank lines, and comment lines for each of the .java files that is part of your project. The command to obtain the report data is as follows (assuming you are using the above version):

cloc-1.92.exe --by-file *.java

This report should be copied into a text (.txt) file, which you will upload to the Blackboard system for IR.P1.Task#10.



