OS: Linux, Ubuntu Server 9.10
Upgrading hadoop-0.18.3 to hadoop-0.20.1
First stop the running 0.18.3 daemons:
cd /opt/hadoop
bin/stop-mapred.sh
bin/stop-dfs.sh
cd /opt/
wget http://ftp.twaren.net/Unix/Web/apache/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
tar -zxvf hadoop-0.20.1.tar.gz
mv hadoop hadoop-0.18.3
mv hadoop-0.20.1 hadoop
cd /opt/hadoop
Add the following to /opt/hadoop/conf/hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/conf
export HADOOP_LOG_DIR=/tmp/hadoop/logs
export HADOOP_PID_DIR=/tmp/hadoop/pids
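A wrong JAVA_HOME is a common reason the daemons fail to start after an upgrade, so it can be worth checking the path before going further. A minimal sketch; the function name check_java_home is made up for illustration and is not part of Hadoop:

```shell
# Return 0 if the given directory looks like a usable JAVA_HOME,
# i.e. it contains an executable bin/java. (Hypothetical helper.)
check_java_home() {
    java_home="$1"
    if [ -x "$java_home/bin/java" ]; then
        echo "OK: $java_home/bin/java"
        return 0
    fi
    echo "MISSING: $java_home/bin/java" >&2
    return 1
}
```

For the setting above, this would be called as: check_java_home /usr/lib/jvm/java-6-sun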
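The old hadoop-site.xml has been split into three files: core-site.xml, hdfs-site.xml and mapred-site.xml.
See the official documentation for which properties belong in each file: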
http://hadoop.apache.org/common/docs/current/core-default.html
http://hadoop.apache.org/common/docs/current/hdfs-default.html
http://hadoop.apache.org/common/docs/current/mapred-default.html
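When moving properties out of an old hadoop-site.xml, the name prefix is a reasonable first guess for the target file. The sketch below encodes only that rough heuristic; the function name site_file_for is hypothetical, and some properties do not follow the prefix rule, so the default lists linked above should be treated as authoritative:

```shell
# Guess which 0.20 config file a property name belongs in.
# Heuristic only (an assumption, not an official rule):
#   dfs.*    -> hdfs-site.xml
#   mapred.* -> mapred-site.xml
#   anything else -> core-site.xml
site_file_for() {
    case "$1" in
        dfs.*)    echo "hdfs-site.xml" ;;
        mapred.*) echo "mapred-site.xml" ;;
        *)        echo "core-site.xml" ;;
    esac
}
```

For example, site_file_for dfs.replication prints hdfs-site.xml, and site_file_for fs.default.name prints core-site.xml.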
My settings are as follows:
hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/tmp/hadoop/hadoop-${user.name}/name</value>
<description></description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/tmp/hadoop/hadoop-${user.name}/data</value>
<description></description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description></description>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.1.4:9001</value>
<description> </description>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop/hadoop-${user.name}</value>
<description> </description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.1.4:9000</value>
<description> </description>
</property>
</configuration>
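You also need to edit two files, masters and slaves, in /opt/hadoop/conf.
masters: the IP of the machine running the NameNode and JobTracker (the start scripts use this file to decide where to start the SecondaryNameNode)
slaves: the IPs of the machines running a DataNode and TaskTracker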
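Both files are plain text with one host per line. A small sketch of writing them from the shell; the function name write_host_files is hypothetical, and the slave addresses in the usage line are made-up examples (only 192.168.1.4 comes from the configuration above):

```shell
# Write CONF_DIR/masters and CONF_DIR/slaves, one host per line.
# Usage: write_host_files CONF_DIR MASTER_IP SLAVE_IP...
# (Hypothetical helper; overwrites both files.)
write_host_files() {
    conf_dir="$1"
    master="$2"
    shift 2
    printf '%s\n' "$master" > "$conf_dir/masters"
    printf '%s\n' "$@" > "$conf_dir/slaves"
}
```

Example call: write_host_files /opt/hadoop/conf 192.168.1.4 192.168.1.5 192.168.1.6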
Start HDFS with the upgrade option:
bin/start-dfs.sh -upgrade
Check the version:
bin/hadoop version
Check the current state of the DFS:
bin/hadoop dfsadmin -report
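After starting with -upgrade, HDFS keeps the previous version's state on disk until the upgrade is finalized. The sketch below wraps the two follow-up commands, assuming the standard dfsadmin options -upgradeProgress and -finalizeUpgrade; the HADOOP variable and the function names are illustrative only:

```shell
# Path to the hadoop launcher; override HADOOP to point elsewhere.
HADOOP="${HADOOP:-bin/hadoop}"

# Show whether the HDFS upgrade is still in progress.
upgrade_status() {
    "$HADOOP" dfsadmin -upgradeProgress status
}

# Make the upgrade permanent. Once this has run,
# rolling back to the 0.18.3 state is no longer possible.
finalize_upgrade() {
    "$HADOOP" dfsadmin -finalizeUpgrade
}
```

It is usually safer to run finalize_upgrade only after the new version has been working correctly for a while.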
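If the data is unimportant or can be deleted, you can simply format the NameNode instead (this wipes the existing HDFS metadata):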
bin/hadoop namenode -format
Try running a job:
bin/hadoop jar hadoop-*-examples.jar wordcount /input /output
You need to put some files into /input first; create the directory with bin/hadoop fs -mkdir.
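The input and output steps around the wordcount job can be sketched as below, using the fs subcommands -mkdir, -put and -cat. The HADOOP variable and the function names are illustrative, and the local file names passed to load_input are up to you:

```shell
# Path to the hadoop launcher; override HADOOP to point elsewhere.
HADOOP="${HADOOP:-bin/hadoop}"

# Create /input in HDFS and upload one or more local files into it.
# Usage: load_input LOCAL_FILE...
load_input() {
    "$HADOOP" fs -mkdir /input
    "$HADOOP" fs -put "$@" /input
}

# Print the word counts written by the job.
# The glob is quoted so that HDFS, not the local shell, expands it.
show_output() {
    "$HADOOP" fs -cat '/output/part-*'
}
```

Note that the job refuses to start if /output already exists, so remove it (for example with bin/hadoop fs -rmr /output) before running the job again.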