The necessity to limit VARCHAR length for indexed columns using coallation utf8_unicode_ci











up vote
0
down vote

favorite












I am trying to understand a MySQL error and it is related to some defaults Rails generates with ActiveRecord migrations. Given this:



rails generate migration AddDetailsToProducts supplier:index:references{polymorphic}

class AddDetailsToProducts < ActiveRecord::Migration
def change
add_reference :products, :supplier, polymorphic: true, index: true, foreign_key: true
end
end


So what does this do? First, let's look at polymorphic. Polymorphic will create a supplier_type VARCHAR(255) and supplier_id INT column in products. I believe the VARCHAR is set to 256 characters, since early versions of MySQL did not support more. But remember that VARCHAR columns in the database are variable length, so there's no storage advantage to a ten character value in a VARCHAR(255) versus a VARCHAR(20).



references adds the supplier_type and supplier_id as foreign keys of products to the suppliers primary key. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table. It is a key used to link two tables together.



So I think "add_reference" does something like this:



CREATE TABLE products (
PRIMARY KEY (id),
FOREIGN KEY (supplier_id) REFERENCES suppliers(id)
FOREIGN KEY (supplier_type) REFERENCES suppliers(id)
)
CREATE INDEX `index_suppliers_on_supplier_type` ON `suppliers` (`supplier_type`)
CREATE INDEX `index_suppliers_on_supplier_id` ON `suppliers` (`supplier_id`)


Now I received an error like this:




Specified key was too long; max key length is 767 bytes: CREATE INDEX
index_suppliers_on_supplier_type ON suppliers (supplier_type)




So what we have is a supplier_type column which is VARCHAR(255) and we attempted to place an index on it. I am using the utf8_unicode_ci coallation. My understanding is this uses 1 to 3 bytes per character. So even if this was to use 3 bytes for all characters with a maximum of 256 characters, that is 256 * 3 = 768. One byte over. It really doesn't make sense. Is the solution really just to add a limit to the maximum character size for the column? Am I understanding this correctly?



Because when I do this, the error goes away:



class ChangeSuppliers < ActiveRecord::Migration
def change
change_column :suppliers, :supplier_type, :string, limit: 191
end
end









share|improve this question


























    up vote
    0
    down vote

    favorite












    I am trying to understand a MySQL error and it is related to some defaults Rails generates with ActiveRecord migrations. Given this:



    rails generate migration AddDetailsToProducts supplier:index:references{polymorphic}

    class AddDetailsToProducts < ActiveRecord::Migration
    def change
    add_reference :products, :supplier, polymorphic: true, index: true, foreign_key: true
    end
    end


    So what does this do? First, let's look at polymorphic. Polymorphic will create a supplier_type VARCHAR(255) and supplier_id INT column in products. I believe the VARCHAR is set to 256 characters, since early versions of MySQL did not support more. But remember that VARCHAR columns in the database are variable length, so there's no storage advantage to a ten character value in a VARCHAR(255) versus a VARCHAR(20).



    references adds the supplier_type and supplier_id as foreign keys of products to the suppliers primary key. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table. It is a key used to link two tables together.



    So I think "add_reference" does something like this:



    CREATE TABLE products (
    PRIMARY KEY (id),
    FOREIGN KEY (supplier_id) REFERENCES suppliers(id)
    FOREIGN KEY (supplier_type) REFERENCES suppliers(id)
    )
    CREATE INDEX `index_suppliers_on_supplier_type` ON `suppliers` (`supplier_type`)
    CREATE INDEX `index_suppliers_on_supplier_id` ON `suppliers` (`supplier_id`)


    Now I received an error like this:




    Specified key was too long; max key length is 767 bytes: CREATE INDEX
    index_suppliers_on_supplier_type ON suppliers (supplier_type)




    So what we have is a supplier_type column which is VARCHAR(255) and we attempted to place an index on it. I am using the utf8_unicode_ci coallation. My understanding is this uses 1 to 3 bytes per character. So even if this was to use 3 bytes for all characters with a maximum of 256 characters, that is 256 * 3 = 768. One byte over. It really doesn't make sense. Is the solution really just to add a limit to the maximum character size for the column? Am I understanding this correctly?



    Because when I do this, the error goes away:



    class ChangeSuppliers < ActiveRecord::Migration
    def change
    change_column :suppliers, :supplier_type, :string, limit: 191
    end
    end









    share|improve this question
























      up vote
      0
      down vote

      favorite









      up vote
      0
      down vote

      favorite











      I am trying to understand a MySQL error and it is related to some defaults Rails generates with ActiveRecord migrations. Given this:



      rails generate migration AddDetailsToProducts supplier:index:references{polymorphic}

      class AddDetailsToProducts < ActiveRecord::Migration
      def change
      add_reference :products, :supplier, polymorphic: true, index: true, foreign_key: true
      end
      end


      So what does this do? First, let's look at polymorphic. Polymorphic will create a supplier_type VARCHAR(255) and supplier_id INT column in products. I believe the VARCHAR is set to 256 characters, since early versions of MySQL did not support more. But remember that VARCHAR columns in the database are variable length, so there's no storage advantage to a ten character value in a VARCHAR(255) versus a VARCHAR(20).



      references adds the supplier_type and supplier_id as foreign keys of products to the suppliers primary key. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table. It is a key used to link two tables together.



      So I think "add_reference" does something like this:



      CREATE TABLE products (
      PRIMARY KEY (id),
      FOREIGN KEY (supplier_id) REFERENCES suppliers(id)
      FOREIGN KEY (supplier_type) REFERENCES suppliers(id)
      )
      CREATE INDEX `index_suppliers_on_supplier_type` ON `suppliers` (`supplier_type`)
      CREATE INDEX `index_suppliers_on_supplier_id` ON `suppliers` (`supplier_id`)


      Now I received an error like this:




      Specified key was too long; max key length is 767 bytes: CREATE INDEX
      index_suppliers_on_supplier_type ON suppliers (supplier_type)




      So what we have is a supplier_type column which is VARCHAR(255) and we attempted to place an index on it. I am using the utf8_unicode_ci coallation. My understanding is this uses 1 to 3 bytes per character. So even if this was to use 3 bytes for all characters with a maximum of 256 characters, that is 256 * 3 = 768. One byte over. It really doesn't make sense. Is the solution really just to add a limit to the maximum character size for the column? Am I understanding this correctly?



      Because when I do this, the error goes away:



      class ChangeSuppliers < ActiveRecord::Migration
      def change
      change_column :suppliers, :supplier_type, :string, limit: 191
      end
      end









      share|improve this question













      I am trying to understand a MySQL error and it is related to some defaults Rails generates with ActiveRecord migrations. Given this:



      rails generate migration AddDetailsToProducts supplier:index:references{polymorphic}

      class AddDetailsToProducts < ActiveRecord::Migration
      def change
      add_reference :products, :supplier, polymorphic: true, index: true, foreign_key: true
      end
      end


      So what does this do? First, let's look at polymorphic. Polymorphic will create a supplier_type VARCHAR(255) and supplier_id INT column in products. I believe the VARCHAR is set to 256 characters, since early versions of MySQL did not support more. But remember that VARCHAR columns in the database are variable length, so there's no storage advantage to a ten character value in a VARCHAR(255) versus a VARCHAR(20).



      references adds the supplier_type and supplier_id as foreign keys of products to the suppliers primary key. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table. It is a key used to link two tables together.



      So I think "add_reference" does something like this:



      CREATE TABLE products (
      PRIMARY KEY (id),
      FOREIGN KEY (supplier_id) REFERENCES suppliers(id)
      FOREIGN KEY (supplier_type) REFERENCES suppliers(id)
      )
      CREATE INDEX `index_suppliers_on_supplier_type` ON `suppliers` (`supplier_type`)
      CREATE INDEX `index_suppliers_on_supplier_id` ON `suppliers` (`supplier_id`)


      Now I received an error like this:




      Specified key was too long; max key length is 767 bytes: CREATE INDEX
      index_suppliers_on_supplier_type ON suppliers (supplier_type)




      So what we have is a supplier_type column which is VARCHAR(255) and we attempted to place an index on it. I am using the utf8_unicode_ci coallation. My understanding is this uses 1 to 3 bytes per character. So even if this was to use 3 bytes for all characters with a maximum of 256 characters, that is 256 * 3 = 768. One byte over. It really doesn't make sense. Is the solution really just to add a limit to the maximum character size for the column? Am I understanding this correctly?



      Because when I do this, the error goes away:



      class ChangeSuppliers < ActiveRecord::Migration
      def change
      change_column :suppliers, :supplier_type, :string, limit: 191
      end
      end






      mysql ruby-on-rails database






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked 11 hours ago









      Donato

      1,78621228




      1,78621228
























          1 Answer
          1






          active

          oldest

          votes

















          up vote
          0
          down vote













          The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



          Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



          In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.






          share|improve this answer





















            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














             

            draft saved


            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53400663%2fthe-necessity-to-limit-varchar-length-for-indexed-columns-using-coallation-utf8%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            0
            down vote













            The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



            Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



            In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.






            share|improve this answer

























              up vote
              0
              down vote













              The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



              Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



              In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.






              share|improve this answer























                up vote
                0
                down vote










                up vote
                0
                down vote









                The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



                Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



                In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.






                share|improve this answer












                The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



                Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



                In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered 9 hours ago









                danblack

                959213




                959213






























                     

                    draft saved


                    draft discarded



















































                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53400663%2fthe-necessity-to-limit-varchar-length-for-indexed-columns-using-coallation-utf8%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Berounka

                    Sphinx de Gizeh

                    Different font size/position of beamer's navigation symbols template's content depending on regular/plain...