Prioritising Local Changes












I'm scraping pages from a website, munging them, then compiling them into an ebook. I'm using Git for both the code and the HTML content.



I have to make manual edits to some pages, and they're often updated upstream. This leaves me with the problem of how to retain my local edits when the site updates.



For example, I download v1 of page A, I delete an invalid "", and commit my changes; later I download v2 of page A, which has new content, but still features "". I want to merge the new content into my copy of page A, but also apply my local changes.



I suspect I'll need to manually resolve conflicts sometimes, but on the whole this should be automatic.



I've experimented with merge strategies, rebasing, and other approaches to no avail. What am I missing?



EDIT:



To help clarify my problem:



git init
wget -O page.html https://example.com/
git add page.html
git commit -a -m "w0"
git checkout -b ebook
sed -i -e 's/http:/https:/' page.html
git commit -a -m "e1"
git checkout master
git merge ebook
wget -O - https://example.com/ | sed -e 's/may/may not/' > page.html
git commit -a -m w1
git checkout ebook
git merge master


At the end, the last local edit is preserved but the first is lost. I know I'm doing something stupid, but...










      git git-merge epub epub3














      edited Nov 23 '18 at 20:20









      isherwood











      asked Nov 22 '18 at 20:35









      A. Zed

























          1 Answer














          I would maintain a branch that tracks only the original web pages; let's call it web. Every time you download an update, commit it to the web branch. Then you need an ebook branch for your changes, initially created as a branch off the first web commit. After each update to the web branch, merge it into your ebook branch, resolving any conflicts that arise.



          Scenario: Let's assume you started with W0 as the initial state on the web server, then you made local changes in commits E1 and E2. Then the web server was updated to W1, which you merge in to ebook to get E3.



          That would give you a history that looks like this:



          W0 -------- W1    (web branch)
            \           \
             E1 - E2 --- E3 (ebook branch)


          When you download the next update to web, W2, you'll get this commit graph, assuming you also had E4 as additional reformatting changes required because of W1:



          W0 -------- W1 -------- W2    (web branch)
            \           \           \
             E1 - E2 --- E3 - E4 --- E5 (ebook branch)


          When you merge W2 into E4 to get E5, Git should apply only the changes between W1 and W2 to E4, which should do what you want.



          Note: this process only ever merges from web into ebook, never from ebook into web. Merging from ebook back into web would undo the desired effect, as discussed in the comments below this answer.
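The workflow above can be sketched as a small script. This is only an illustration under stated assumptions: it builds a throwaway repository in a temporary directory, the `fetch` helper is a hypothetical stand-in for the `wget` download so that the demo runs offline, and the page text and commit labels are made up.

```shell
#!/bin/sh
# Sketch: web only ever receives pristine downloads; ebook receives
# local edits plus merges from web. Nothing is merged back into web.
set -eu

repo=$(mktemp -d)          # throwaway demo repository (assumption)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical stand-in for wget: writes the "upstream" page.
# $1 = wording of the middle line, $2 = version label of the last line.
fetch() {
  printf 'See http://example.com/ for details.\na\nb\nc\nYou %s use this domain.\nd\ne\nf\nLast updated: %s\n' "$1" "$2" > page.html
}

# W0: first download, committed on the web branch.
fetch "may" v0
git add page.html
git commit -q -m "W0"
git branch -m web

# E1: local fix, committed on the ebook branch only.
git checkout -q -b ebook
sed 's/http:/https:/' page.html > page.tmp && mv page.tmp page.html
git commit -q -a -m "E1"

# W1: upstream changes one line; the uncorrected http link is still there.
git checkout -q web
fetch "may not" v1
git commit -q -a -m "W1"

# E2: merge web into ebook (one direction only).
git checkout -q ebook
git merge -q -m "E2: merge W1" web

# W2: a second upstream update, to show that the loop can be repeated.
git checkout -q web
fetch "may not" v2
git commit -q -a -m "W2"

# E3: merge again; the merge base is now W1, so only W1..W2 is applied.
git checkout -q ebook
git merge -q -m "E3: merge W2" web

cat page.html
```

If both merges go through cleanly, the final page.html on ebook should still carry the https link alongside both upstream changes. The edits here touch lines that are far apart; when upstream changes the same lines you edited, expect a conflict to resolve by hand.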






          • This is pretty much what I tried first. The problem being that after merging L1-3 for O1, when I update to get O3, I cannot merge the same commits in again. So, on the second update my local changes get overridden. Does that make sense?
            – A. Zed
            Nov 22 '18 at 21:03










          • It should work when you iterate. I'll add a second loop to my answer.
            – joanis
            Nov 22 '18 at 21:04












          • Hmmm, that's what I tried. But because the web branch always has the error that needs correcting, and the commit that fixes that has already been applied, Git feels the branches are up-to-date. Maybe I need to put together a minimal test case.
            – A. Zed
            Nov 22 '18 at 21:16










          • I can't get my example to format in the comments, so it's here: pastebin.com/csY3ZeWZ . I know I'm doing something stupid, but can't see what...
            – A. Zed
            Nov 22 '18 at 22:05










          • Hum, I'll try to look at your link from another machine, but it's not loading. It might be blocked by my corporate firewall.
            – joanis
            Nov 23 '18 at 17:21











          edited Nov 23 '18 at 20:31

























          answered Nov 22 '18 at 20:58









          joanis












