Attribute relation file format | Machine Learning

In this article, we will look at the attribute relation file format (ARFF) and how to use it for machine learning in Java.
Submitted by Raunak Goswami, on September 01, 2018

Today, we will look at the attribute relation file format (ARFF) for machine learning in Java, and we will write a small Java program that converts the popular .csv file format into .arff. The format was developed by the computer science department of the University of Waikato. As the name suggests, the file contains a list of attributes and one class attribute. An .arff file is broadly divided into two portions:

  1. Header field
  2. Data field

Now, let us discuss these fields in detail.

1) Header field

The header field describes the relation, the names of the attributes, and their datatypes. The main difference between .csv and .arff files is that in a .csv file the attribute values sit directly below the attribute names, whereas in an .arff file the attribute names are declared separately, followed by the data in its own data field. The basic syntax for declaring an attribute in the header portion is as follows:

 @attribute <attribute-name> <datatype>

The image below shows an example of the .arff file format:

[Image: relational header of the head-brain data set]

This example is a data set containing the head-brain relation of various users. From the picture above, one can easily identify the number of attributes along with the type of data they contain. In our example, the data in all four attributes is numeric. Apart from numeric, the datatype can also be nominal, string, or date.
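To make the header syntax concrete, here is a minimal sketch in plain Java (no Weka needed) that assembles a header as text. The class name ArffHeaderSketch, the relation name headbrain and the four attribute names are my own placeholders and are not taken from the data set. Only the @relation and @attribute keywords come from the format itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ArffHeaderSketch {

    // Builds one header line of the form: @attribute <attribute-name> <datatype>
    static String attributeLine(String name, String datatype) {
        return "@attribute " + name + " " + datatype;
    }

    // Builds the whole header: the relation name, a blank line,
    // then one @attribute line per entry (in insertion order).
    static String header(String relation, Map<String, String> attributes) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append(attributeLine(e.getKey(), e.getValue())).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical attribute names; all four are numeric,
        // matching the four numeric attributes described above.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("Gender", "numeric");
        attrs.put("AgeRange", "numeric");
        attrs.put("HeadSize", "numeric");
        attrs.put("BrainWeight", "numeric");
        System.out.print(header("headbrain", attrs));
    }
}
```

A nominal attribute would be declared the same way, with its allowed values listed in braces as the datatype, for example {yes,no}.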

2) Data field

This field contains the values of the attributes declared in the header field. These are the values our model will use to make predictions, and they determine how accurate its results can be. The values are separated by commas and listed under the @data heading. As mentioned above, each value can be of one of the following types:

  1. Numerical
  2. Nominal
  3. String
  4. Date-time format

The .csv file that I have used can be downloaded from here: headbrain7.csv

Below is the code, written in Java in the Eclipse IDE, for converting the .csv file into the .arff file format. Make sure you have set the path to the weka.jar file. If you haven't, just have a look at my previous article: Introduction to weka and Machine learning in Java


Code:

import java.io.File;
import java.io.IOException;

import weka.core.Instances;
import weka.core.converters.ArffSaver;
import weka.core.converters.CSVLoader;

public class wekaapi {

	public static void main(String[] args) throws IOException {
		// load the CSV file
		CSVLoader loader = new CSVLoader();
		loader.setSource(new File("C:\\Users\\Logan\\Desktop\\ML\\linearregression\\headbrain.csv"));
		Instances data = loader.getDataSet(); // get instances object

		// write the instances out in .arff format
		ArffSaver saver = new ArffSaver();
		saver.setInstances(data); // set the dataset we want to convert
		saver.setFile(new File("C:\\Users\\Logan\\Desktop\\ML\\headbrain.arff"));
		saver.writeBatch();

		// printing an Instances object shows the data set in .arff form
		System.out.println("The .arff file format is as follows");
		System.out.println(data);
	}

}

Output

[Image: Attribute relation file format output]
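To get a rough feel for what such a conversion involves, here is a hand-written sketch in plain Java that turns a small piece of CSV text into ARFF text. It is only an illustration and not how Weka's CSVLoader or ArffSaver are implemented. The class name CsvToArffSketch is hypothetical, and the sketch assumes a header line, no quoted fields, no missing values, and the same number of columns in every row. A column is declared numeric if every value parses as a number, otherwise it is declared nominal with the distinct values it contains.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CsvToArffSketch {

    // True if the text can be parsed as a floating-point number.
    static boolean isNumeric(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Converts CSV text (first line = column names) into ARFF text.
    static String convert(String relation, String csv) {
        String[] lines = csv.trim().split("\\R");
        String[] names = lines[0].split(",");

        // Split every data line into trimmed cells.
        List<String[]> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(",");
            for (int c = 0; c < cells.length; c++) {
                cells[c] = cells[c].trim();
            }
            rows.add(cells);
        }

        // Header field: relation name plus one @attribute line per column.
        StringBuilder out = new StringBuilder("@relation " + relation + "\n\n");
        for (int col = 0; col < names.length; col++) {
            boolean numeric = true;
            Set<String> labels = new LinkedHashSet<>();
            for (String[] row : rows) {
                labels.add(row[col]);
                if (!isNumeric(row[col])) {
                    numeric = false;
                }
            }
            String type = numeric ? "numeric" : "{" + String.join(",", labels) + "}";
            out.append("@attribute ").append(names[col].trim()).append(" ").append(type).append("\n");
        }

        // Data field: the rows, comma separated, under @data.
        out.append("\n@data\n");
        for (String[] row : rows) {
            out.append(String.join(",", row)).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from headbrain7.csv
        String csv = "size,label\n10,a\n20,b\n";
        System.out.print(convert("demo", csv));
    }
}
```

For real data sets, the Weka classes used in the code above remain the better choice, since this sketch ignores quoting, missing values and the string and date types.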

Clean display and proper organization of data make .arff files a popular choice among data scientists for their analysis. That was all for today, guys. I hope you liked this article. Stay tuned for more, and have a great day ahead.





