Learn more about the Hadoop file system

Created: Oct 8, 2019 03:26:35 | Latest reply: Oct 9, 2019 07:20:08
  Rewarded Hi-coins: 0 (problem resolved)

I am very new to Big Data and Hadoop technologies. While trying to understand the architecture, I came up with the questions below. Please help me understand.


1) What is a client, and what is the HDFS architecture?

2) If my file is 128 MB, then ideally it should be divided into 2 blocks of 64 MB each. My question is where this file chopping/splitting happens. Is it on the client side? If so, how does it happen? I am trying to understand how the splitting happens when I send a 128 MB file to HDFS. Please help me with this.

3) Who are the competitors of Big Data?

4) What are the disadvantages of Big Data?


Thanks in advance.


Featured Answers
songminwang
Admin Created Oct 8, 2019 03:31:54 Helpful(3)

Hello!
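1) In business terms, 'client' refers to the project owner; after all, big data is evaluated for the business improvement of the client. In the HDFS architecture itself, the client is the program (for example, the hdfs dfs shell or an application using the Hadoop FileSystem API) that contacts the NameNode for metadata and reads or writes blocks on the DataNodes. 'Hadoop architecture' signifies the set of rules and standards that form the core and that every component must obey. Example: after constructing four pillars on the ground level, one cannot construct six pillars on the next level.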

2) First, you have to understand the difference between block size and split size; the two are different concepts. A block is the physical representation of the data. A split is the logical representation of the data present in the blocks. Input splits are created at job startup. A record reader is created for each input split; its responsibility is to take the reference from the input split and produce the actual key-value pairs. Both the splits and the record readers are created by the InputFormat. Working from input splits rather than raw blocks lets the reader obtain complete records, even when a record crosses a block boundary.
When a user submits a job, the client library takes the request, creates the input splits and the other required classes itself, and passes the complete details to the ResourceManager.
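To make the block/split distinction concrete, here is a minimal Python sketch of how a line-oriented record reader can return complete records even when a record straddles a split boundary. It is a simplified model that assumes newline-delimited records; the function name read_split and the sample data are hypothetical, and this is not Hadoop's actual implementation.

```python
from typing import List


def read_split(data: bytes, start: int, length: int) -> List[bytes]:
    """Return the complete newline-delimited records owned by one split."""
    end = start + length
    pos = start
    if start != 0:
        # Not the first split: the reader of the previous split finishes the
        # line that crosses the boundary, so skip past the next newline.
        newline = data.find(b"\n", start)
        if newline == -1:
            return []
        pos = newline + 1
    records = []
    # A record belongs to this split if it begins at or before the split end,
    # even if its last bytes live in the following block.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        stop = len(data) if newline == -1 else newline
        records.append(data[pos:stop])
        pos = stop + 1
    return records


data = b"alpha\nbravo\ncharlie\ndelta\n"   # 26 bytes, 4 records
split_size = 10                             # deliberately cuts records in half
splits = [(s, min(split_size, len(data) - s)) for s in range(0, len(data), split_size)]
per_split = [read_split(data, s, n) for s, n in splits]
print(per_split)
# [[b'alpha', b'bravo'], [b'charlie', b'delta'], []]
```

Every record is read exactly once, although "bravo" and "delta" each cross a 10-byte boundary. The last split ends up empty because its only line was already claimed by the previous reader.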
Change the replication factor. Open the hdfs-site.xml file. This file is usually found in the conf/ folder of the Hadoop installation directory (etc/hadoop/ in Hadoop 2 and later). Change or add the following property in hdfs-site.xml:
<property>
   <name>dfs.replication</name>
   <value>3</value>
   <description>Block replication</description>
</property>
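As a rough illustration of what the replication factor means for capacity, the following Python sketch computes the raw storage one file consumes. The helper name raw_bytes_needed is hypothetical, and the calculation ignores metadata and any compression.

```python
MB = 1024 * 1024


def raw_bytes_needed(file_size: int, replication: int = 3) -> int:
    """Raw cluster storage used by one file when every block is replicated."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return file_size * replication


print(raw_bytes_needed(128 * MB) // MB)  # 384
```

With dfs.replication set to 3, a 128 MB file therefore takes about 384 MB of disk across the cluster. The configured value is the default for newly written files; for files already in HDFS, the replication factor can be changed with hdfs dfs -setrep.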
The Hadoop Distributed File System (HDFS) was designed to hold and manage large amounts of data; therefore typical HDFS block sizes are significantly larger than the block sizes you would see for a traditional filesystem (for example, the filesystem on my laptop uses a block size of 4 KB). The block size setting is used by HDFS to divide files into blocks and then distribute those blocks across the cluster. For example, if a cluster is using a block size of 64 MB and a 128 MB text file is put into HDFS, HDFS splits the file into two blocks (128 MB / 64 MB) and distributes the two chunks to the data nodes in the cluster. The splitting is done by the HDFS client library as it writes: it streams the data, and each time the current block is full it asks the NameNode for a new block and a set of DataNodes to store it.
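The block arithmetic from the example above can be sketched in a few lines of Python. The function name block_layout is hypothetical; the sketch only models sizes, not where the blocks are placed.

```python
from typing import List

MB = 1024 * 1024


def block_layout(file_size: int, block_size: int) -> List[int]:
    """Return the size of each block a file of file_size bytes is cut into."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The final block holds only the leftover bytes, not a full block.
        sizes.append(remainder)
    return sizes


print([s // MB for s in block_layout(128 * MB, 64 * MB)])  # [64, 64]
print([s // MB for s in block_layout(130 * MB, 64 * MB)])  # [64, 64, 2]
```

Note the second case: a 130 MB file needs three blocks, and the last one occupies only 2 MB on disk rather than a full 64 MB.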
Change the block size. Open the hdfs-site.xml file. This file is usually found in the conf/ folder of the Hadoop installation directory (etc/hadoop/ in Hadoop 2 and later). Set the following property in hdfs-site.xml:
<property>
   <name>dfs.block.size</name>
   <value>134217728</value>
   <description>Block size</description>
</property>

hdfs-site.xml is used to configure HDFS. Changing the dfs.block.size property in hdfs-site.xml changes the default block size for all files placed into HDFS. In this case, we set dfs.block.size to 128 MB (134217728 bytes). In Hadoop 2 and later the property is named dfs.blocksize; dfs.block.size is the older, deprecated name. Changing this setting does not affect the block size of any files currently in HDFS. It only affects the block size of files placed into HDFS after the setting has taken effect.
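To double-check values like these, a small Python sketch can parse the property list and convert the block size from bytes to megabytes. The SAMPLE text below is an illustrative stand-in for a real hdfs-site.xml (whose properties sit inside a configuration element), and read_properties is a hypothetical helper, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for the contents of hdfs-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>
</configuration>"""


def read_properties(xml_text: str) -> dict:
    """Collect name/value pairs from Hadoop-style <property> elements."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


props = read_properties(SAMPLE)
block_size_mb = int(props["dfs.block.size"]) / (1024 * 1024)
print(props["dfs.replication"], block_size_mb)  # 3 128.0
```

A parser like this also catches malformed tags: if a closing tag is missing its slash, ET.fromstring raises a ParseError instead of silently accepting the file.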
3) As of now, big data has no competitors as such. Big data itself is used to understand business shortcomings and improve the business. Example: Amazon uses big data to understand users' choices and preferences; Twitter, Facebook, LinkedIn, and Netflix also use big data.

4) Not good for many small files. Not suited to real-time data processing. Potential stability issues. Security concerns.

All Answers

little_fish
Admin Created Oct 9, 2019 07:20:08 Helpful(0)

Posted by songminwang at 2019-10-08 03:31 Hello!1) Client' is term used to refer Project owners,after all bigdata is evaluated for business im ...
Thanks, songminwang.